protein search, using sw model

December 18, 2006, 19:53:34 ; Search time 199 Seconds
(without alignments)
1295.829 Million cell updates/sec

Title: US-10-807-746-7
Perfect score: 2937
Sequence: 1 MLRLNLRFLSFLLCISQSVE...KMTPFGSLDFSTLYFIQEKH 564

Scoring table:



Searched: 2589679 seqs, 457216429 residues 

Total number of hits satisfying chosen parameters: 



I length: 2000000000 



.-processing: Mir 



i Match 100% 



gene. 



>qp20f 



>eqp2001j 
geneseqp2002s:*
geneseqp2003as:*
geneseqp2003bs:*
geneseqp2004s:*
geneseqp2005s:*
geneseqp2006s:*



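Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.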



Result   Score    Query Match %   Length   DB   ID         Description

1        2937     100.0           564      8    ADT05582   Adt05582 Haemophil
2        2937     100.0           564      8    ADT51367   Adt51367 Non-typea
3        1023.5   34.8            558      7    ABO61169   Abo61169 Klebsiell
4        1021     34.8            605      7    ADF04606   Adf04606 Bacterial
5        1010.5   34.4            558      6    ABM69129   Abm69129 Photorhab
6        764.5    26.0            549      6    ABU30402   Abu30402 Protein e
664.5
649.5 
648.5 
633 
483.5 
483.5 
483.5 



540 8 ADT05677 
542 6 ABU21963 

574 6 ABU22049 
537 7 ABO63268

546 6 ABU32103
551 7 ABO64532
535 9 AED82141 
535 9 AED83036 
535 6 ABU15032 
535 9 ADZ77674 
535 10 AEE97853 
535 10 AEE97710 

535 10 AEF18284 

536 6 ABM70366 
535 6 ABU47545 

541 6 ABU39871 
517 6 ABU45336 

532 6 ABU39012 
535 4 AAU38208 
535 6 ABU27536 

547 6 ABU22013 
535 6 ABD50401 
535 6 ABU41262 
539 7 ADF06592 
547 6 ABU41922 

537 4 AAU36431 
537 6 AB038751 
621 7 ABO77668

533 6 ABU38752
549 7 ABO77581
563 7 ABO77669
555 7 ABO75310

1898 4 ABG25514

575 7 ABO77582
549 2 AAW98830 



Aao17804 H influen
Abu19675 Protein e
Adt05677 Haemophil 
Abu21963 Protein e 
Abu22049 Protein e 
Abo63268 Klebsiell 
Abu32103 Protein e 
Abo64532 Klebsiell 
Aed82141 Hyperiiranu 
Aed83036 Hyperiiranu 
Abu15032 Protein e
Adz77674 Escherich 
Aee97853 Escherich 
Aee97710 Escherich 
Aef18284 Dipeptide
Abm70366 Photorhab 
Abu47545 Protein e 
Abu39871 Protein e 
Abu45336 Protein e 
Abu39012 Protein e 
Aau38208 Salmonell 
Abu27536 Protein e 
Abu22013 Protein e 
Abu50401 Protein e 
Abu41262 Protein e 
Adf06592 Bacterial 
Abu41922 Protein e 
Aau36431 Pseudomon 
Abu38751 Protein e 
Abo77668 Pseudomon 
Abu38752 Protein e 
Abo77581 Pseudomon 
Abo77669 I 
Abo75310 I 
Abg25514 Novel hum 
Abo77582 Pseudomon 
Aaw98830 H. pylori 
Aau35720 Helicobac 
Abu30753 Protein e 



ALIGNMENTS 



RESULT 1 
ADT05582 

ID ADT05582 standard; protein; 564 AA. 
XX 

AC ADT05582; 
XX 

DT 02-DEC-2004 (first entry) 
XX 

DE Haemophilus influenzae (NTHi) protein - SEQ ID 618. 
XX 

KW middle ear bacterial infection; nasopharynx bacterial infection. 
XX 

OS Haemophilus influenzae.
XX

PN WO2004078949-A2. 
XX 

PD 16-SEP-2004. 

PF 05-MAR-2004; 2004WO-US007001 . 
XX 

PR 06-MAR-2003; 2003US-0453134P. 

PA (CHIL-) CHILDRENS HOSPITAL INC. 

PI Bakaletz LO, Munson RS, Dyer DW; 




PT New polynucleotides of nontypeable strain of Haemophilus influenzae,
PT useful for treating or preventing NTHi bacterial infections of the middle
PT ear and/or nasopharynx.
XX

PS Claim 3; SEQ ID NO 618; 88pp; English.
XX 

CC The invention comprises nucleotide sequences (genes) from the genome of a 

CC nontypeable strain of Haemophilus influenzae (NTHi). The NTHi DNA

CC sequences of the invention are useful for treating or preventing NTHi 

CC bacterial infections of the middle ear and/or nasopharynx. The present 

CC amino acid sequence represents an NTHi protein of the invention. 

XX 

SQ Sequence 564 AA; 

Query Match 100.0%; Score 2937; DB 8; Length 564; 
Best Local Similarity 100.0%; Pred. No. 1.2e-253; 

Matches 564; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 MLRLNLRFLSFLLCISQSVELQAAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMNV 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MLRLNLRFLSFLLCISQSVELQAAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMNV 60

Qy 61 VTEQIYNKLFDIKNHSATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNA 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 VTEQIYNKLFDIKNHSATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNA 120

Qy 121 EDVVFSINRVLGHNTYLPTLAEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSVT 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 EDVVFSINRVLGHNTYLPTLAEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSVT 180

Qy 181 ALSPYQVKIELFAPDSSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQVK 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 ALSPYQVKIELFAPDSSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQVK 240

Qy 241 DYVYNQYVRLVRNENYWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGLL 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DYVYNQYVRLVRNENYWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGLL 300

Qy 301 KNDDKHYYMQSTDGMNLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVAN 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 KNDDKHYYMQSTDGMNLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVAN 360

Qy 361 NIIPEVSWASTVNTPEFEFDYHPKIAKNKLADKNLLLNLWVINEEQVYNPAPFKMAEMIK 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 NIIPEVSWASTVNTPEFEFDYHPKIAKNKLADKNLLLNLWVINEEQVYNPAPFKMAEMIK 420

Qy 421 WDLAQAGVKVKVRAVTRPFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILSCGTKNE 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 WDLAQAGVKVKVRAVTRPFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILSCGTKNE 480

Qy 481 LTNLSNWCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRILVANSR 540
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 LTNLSNWCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRILVANSR 540

Qy 541 VKGVKMTPFGSLDFSTLYFIQEKH 564
       ||||||||||||||||||||||||
Db 541 VKGVKMTPFGSLDFSTLYFIQEKH 564
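
The Matches figure reported for a result can be cross-checked against its alignment rows. The following is a minimal Python sketch, assuming (the report does not state this) that a match is a column in which the Qy and Db rows carry the same residue and neither is a gap character `-`. The function name `count_identities` is hypothetical.

```python
def count_identities(qy_row: str, db_row: str) -> int:
    """Count columns where both aligned rows hold the same residue.

    Gap characters ('-') are never counted as matches.
    """
    if len(qy_row) != len(db_row):
        raise ValueError("aligned rows must have the same number of columns")
    return sum(1 for q, d in zip(qy_row, db_row) if q == d and q != "-")


# First 60-column block of RESULT 1 (Qy 1-60 against Db 1-60).
qy = "MLRLNLRFLSFLLCISQSVELQAAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMNV"
db = "MLRLNLRFLSFLLCISQSVELQAAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMNV"
print(count_identities(qy, db), "of", len(qy), "columns identical")
```

Summing this count over every block of a result should reproduce its Matches value, provided the rows are free of OCR damage.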



RESULT 2 
ADT51367 

ID ADT51367 standard; protein; 564 AA. 
XX 

AC ADT51367; 
XX 

DT 30-DEC-2004 (first entry) 
XX 

DE Non-typeable Haemophilus influenzae 



KW Lai; auditory; antiinflammatory; antiarthritic; gene therapy;
KW diagnosis; NTHi bacterial infection; otitis media; pneumonia;
KW septic arthritis; meningitis.
XX

OS Haemophilus influenzae.
XX

PN WO2004087749-A2.
XX

PD 14-OCT-2004.
XX

PF 24-MAR-2004; 2004WO-US009021.
XX 

PR 27-MAR-2003; 2003US-0458234P.
XX 

PA (CHIL-) CHILDRENS HOSPITAL INC. 
XX 

PI Bakaletz LO, Munson RS; 



PT New nontypeable strain of Haemophilus influenzae (NTHi) genes and

PT polypeptides for diagnosing, preventing or treating NTHi bacterial 

PT infections, such as otitis media, pneumonia, sinusitis, septic arthritis 

PT or meningitis. 
XX 

PS Claim 5; SEQ ID NO 7; 93pp; English. 
XX 

CC The invention relates to an isolated polynucleotide comprising any of the

CC 7 fully defined sequences of 810-2979 bp given in the specification. The

CC encoded polypeptide comprises any of the 7 fully defined sequences of

CC 269-992 amino acids given in the specification. The composition and methods

CC are useful for diagnosing, preventing or treating NTHi bacterial

CC infections, such as otitis media, pneumonia, sinusitis, septic arthritis

CC or meningitis. This sequence corresponds to a protein from Haemophilus

CC influenzae used in the invention.
XX

SQ Sequence 564 AA;

Query Match 100.0%; Score 2937; DB 8; Length 564; 

Best Local Similarity 100.0%; Pred. No. 1.2e-253; 

Matches 564; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 MLRLNLRFLSFLLCISQSVELQAAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMNV 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MLRLNLRFLSFLLCISQSVELQAAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMNV 60

Qy 61 VTEQIYNKLFDIKNHSATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNA 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 VTEQIYNKLFDIKNHSATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNA 120

Qy 121 EDVVFSINRVLGHNTYLPTLAEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSVT 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 EDVVFSINRVLGHNTYLPTLAEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSVT 180

Qy 181 ALSPYQVKIELFAPDSSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQVK 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 ALSPYQVKIELFAPDSSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQVK 240

Qy 241 DYVYNQYVRLVRNENYWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGLL 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DYVYNQYVRLVRNENYWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGLL 300

Qy 301 KNDDKHYYMQSTDGMNLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVAN 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 KNDDKHYYMQSTDGMNLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVAN 360

Qy 361 NIIPEVSWASTVNTPEFEFDYHPKIAKNKLADKNLLLNLWVINEEQVYNPAPFKMAEMIK 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 NIIPEVSWASTVNTPEFEFDYHPKIAKNKLADKNLLLNLWVINEEQVYNPAPFKMAEMIK 420

Qy 421 WDLAQAGVKVKVRAVTRPFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILSCGTKNE 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 WDLAQAGVKVKVRAVTRPFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILSCGTKNE 480

Qy 481 LTNLSNWCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRILVANSR 540
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 LTNLSNWCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRILVANSR 540

Qy 541 VKGVKMTPFGSLDFSTLYFIQEKH 564
       ||||||||||||||||||||||||
Db 541 VKGVKMTPFGSLDFSTLYFIQEKH 564

RESULT 3
ABO61169

ID ABO61169 standard; protein; 558 AA.
XX

AC ABO61169;

XX 

DT 29-JUL-2004 (first entry) 
XX 

DE Klebsiella pneumoniae polypeptide seqid 7686. 
XX 

KW Recombinant expression vector; transcription regulatory element; 

KW Klebsiella pneumoniae protein; antibacterial; Vaccine. 

XX 

OS Klebsiella pneumoniae. 
XX 

PN US6610836-B1. 
XX 

PD 26-AUG-2003. 
XX 

PF 27-JAN-2000; 2000US-00489039 . 
XX 

PR 29-JAN-1999; 99US-0117747P. 
XX 

PA (GENO-) GENOME THERAPEUTICS CORP. 
XX 

PI Breton GL, Osborne M; 
XX 

DR WPI; 2003-895346/82. 

DR N-PSDB; ACH94720. 

PT New nucleic acid encoding a Klebsiella pneumoniae polypeptide, useful for 

PT preparing a vaccine composition against Klebsiella pneumoniae. 

PS Disclosure; SEQ ID NO 7686; 932pp; English. 
XX 

CC The invention describes a new isolated nucleic acid encoding a Klebsiella 

CC pneumoniae polypeptide. Also described are: a recombinant expression 

CC vector comprising the nucleic acid, operably linked to a transcription 

CC regulatory element; and a cell comprising the recombinant expression 

CC vector. The nucleic acid is useful for preparing a vaccine composition 

CC against Klebsiella pneumoniae. This is the amino acid sequence of a 

CC Klebsiella pneumoniae polypeptide of the invention 
XX 

SQ Sequence 558 AA; 

Query Match 34.8%; Score 1023.5; DB 7; Length 558; 

Best Local Similarity 38.8%; Pred. No. 3e-82; 

Matches 209; Conservative 106; Mismatches 194; Indels 29; Gaps 7; 
Qy 23 AAPSVP--TFLTENGLTYCTHASGFSFNPQTADAGTSMNVVTEQIYNKLFDIKNHSATLT 80

Db 32 AAPALPDRADIRDSGFVYCVSGQVNTFNPQKVSSGLIVDTLAAQIYDRLLDVDPYTYRLV 91

Qy 81 PMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNAEDVVFSINRVLGHNTYLPTL 140

Db 92 PELAESWEVLDNGATYRFHLRRHVPFQRTAWFTPTRDFNADDVIFTFGRIFNRD 145

Qy 141 AEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSVTALSPYQVKIELFAPDSSILS 200

Db 146 HPWHNV-------------NGSSFPYFDSLQFADSVESVRKLDNQTVEFRLKRPDASFLW 192

Qy 201 HLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQVKDYVYNQYVRLVRNENYWKKE 260

Db 193 HLATHYASITSAEYAARLTQDDRQEQLDRQPVGTGPFQLSDYRSGQYVRLQRHPGYWRGK 252

Qy 261 AKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGLLKNDDKHYYMQSTDGMNLAYL 320

Db 253 PLMPQVVVDLGSGGTGRLSKLLTGECDVLAWPAASQLTILR-DDPRLRLTLRPGMNIAWL 311

Qy 321 AFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVANNIIPEVSWASTVNTPEFEFD 380

Db 312 AFNTAKPPLDNPEVRHALALAINNQRLMQSIYYGTAETAASMLPRASWAYDNDAKITE-- 369

Qy 381 YHPKIAKNKLAD---KNLLLNLWVINEEQVYNPAPFKMAEMIKWDLAQAGVKVKVRAVTR 437

Db 370 YNPQEARARLKALGLENLTLKLWVPTSSQAWNPSPLKTAELIQADMAQIGVKVIIVPVEG 429

Qy 438 PFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILSCGTKNELTNLSNWCNEEFDQFMD 497

Db 430 RFQEARLMDMS--HDLTLSGWATDSNDPDSFFRPLLSCAAIASQTNFAHWCNREFDDVLQ 487

Qy 498 RAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRILVANSRVKGVKMTPFGSLDFS 555

Db 488 KALLSQQLSSRMDAYKEAQRILARELPVLPLASSLRLQAYRYDMKGLVLSPFGNASFA 545
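
The percentages in the summary lines appear to follow from the counts printed beside them. The following is a minimal Python sketch, assuming (inferred from the figures in this report, not stated by it) that Query Match is the hit score divided by the perfect score of 2937, and that Best Local Similarity is Matches divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels). The function names are hypothetical.

```python
PERFECT_SCORE = 2937  # score of the query aligned against itself (RESULT 1)


def query_match(score: float, perfect: float = PERFECT_SCORE) -> float:
    """Hit score expressed as a percentage of the perfect score."""
    return 100.0 * score / perfect


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical columns as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Figures printed for the Klebsiella pneumoniae hit (Score 1023.5).
print(f"{query_match(1023.5):.1f}")                       # prints 34.8
print(f"{best_local_similarity(209, 106, 194, 29):.1f}")  # prints 38.8
```

Under the same assumptions the other hits in this report also reproduce their printed values, for example 1021 / 2937 gives 34.8 and 202 / 546 gives 37.0.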



RESULT 4
ADF04606

ID ADF04606 standard; protein; 605 AA. 
XX 

AC ADF04606; 
XX 

DT 12-FEB-2004 (first entry) 
XX 

DE Bacterial polypeptide #719. 
XX 

KW Proteus mirabilis infection; bacterial infection; antibacterial; 

KW immunostimulant.

XX 

XX 

PN US6605709-B1. 
XX 

PD 12-AUG-2003. 

PF 05-APR-2000; 2000US-00543681 . 
XX 

PR 09-APR-1999; 99US-0128706P . 
XX 

PA (GENO-) GENOME THERAPEUTICS CORP. 
XX 

PI Breton GL; 



PT New Proteus mirabilis polypeptides and polynucleotides, useful as
PT reagents for diagnosis of bacterial disease, as components of
PT antibacterial vaccines, as targets for antibacterial drugs, or as
PT biocontrol agents for plants.
XX

PS Disclosure; SEQ ID NO 4891; 870pp; English.
XX

CC The invention relates to new Proteus mirabilis polypeptides and
CC polynucleotides. The invention also relates to antibodies against the
CC polypeptides, methods for producing the polypeptides, a method of
CC generating vaccines for immunising an individual against P. mirabilis, a
CC method for evaluating a compound for the ability to bind a P. mirabilis
CC polypeptide and a method for screening test compounds for anti-bacterial
CC activity. The polypeptides and polynucleotides are useful as molecular
CC targets for diagnosing, preventing and treating pathological conditions
CC resulting from bacterial infection, as reagents for diagnosis of
CC bacterial diseases, as components of antibacterial vaccines, as targets
CC for antibacterial drugs or as bio-control agents for plants. This
CC sequence represents a Proteus mirabilis polypeptide of the invention.
XX

SQ Sequence 605 AA;




Query Match 34.8%; Score 1021; DB 7; Length 605; 
Best Local Similarity 37.0%; Pred. No. 5.8e-82; 

Matches 202; Conservative 118; Mismatches 198; Indels 28; Gaps 8;


Qy 16 SQSVELQAAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMNVVTEQIYNKLFDIKNH 75

Db 67 TKATELSVAQE-PTNIHQNGFVYCVDGSVNTFNPQLSSSGLIIDPLAAQLYDRLLDVDPY 125

Qy 76 SATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNAEDVVFSINRVLGHNT 135

Db 126 TYRLIPEIAARWESLDNGATYRFYLRKNVSFQTTPWFTPTRKLTADDVIFSFERMFSAN- 184

Qy 136 YLPTLAEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSVTALSPYQVKIELFAPD 195

Db 185 226

Qy 196 SSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQVKDYVYNQYVRLVRNEN 255

Db 227 ASFLWHLATHYAPILSEEYASNLEKSGNQSQLDWKPVGSGPFYLDEFQPGQFVRLLRNEQ 286 

Qy 256 YWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGLLKNDDKHYYMQSTDGM 315 

Db 287 YWKGQPKMQQVVTDTGAGGTGRISKLLTGECDVLAYPAASQLKVLR-DDPRLRLTLRSGM 345

Qy 316 NLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVANNIIPEVSWASTVNTP 375 

Db 346 NIAYLAFNTNKPPFNDLKVRQAIAYAINNERLMGSIYYGTAETAASVLPRASWAYD-NRA 404 

Qy 376 EFEFDYHPKIAKNKLAD---KNLLLNLWVINEEQVYNPAPFKMAEMIKWDLAQAGVKVKV 432

Db 405 KIT-EYNPEKSKQILKELGLEGLKLNLWVPSAPQSYNPSPLK^ELIQADLAQVGIQMNi 463 

Qy 433 RAVTRPFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILSCGTKNELTNLSNWCNEEF 492 

Db 464 RPIEGRYQETSLMDRT--HDMTLSGWSTDSNDPDSFFRPLFSCAAISSQTNLSHWCSPAF 521 

Qy 493 DQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRILVANSRVKGVKMTPFGSL 552 

Db 522 DNVLQQALYSQQLASRIDYYHQAQDILAQELPVLPLANSLRMQAYRYDIKGLVLSTFGNA 581 

Qy 553 DFSTLY 558 

Db 582 SFAGVY 587 

RESULT 5 
ABM69129 

ID ABM69129 standard; protein; 558 AA. 
XX 

AC ABM69129; 
XX 

DT 20-NOV-2003 (first entry) 
XX 

DE Photorhabdus luminescens protein sequence #2226.

KW bactericide; fungicide; insecticide; polymorphism; genetic analysis;

KW detection; food; gene expression; plant; animal; microorganism; toxin; 

KW antibiotic; biopesticide; virulence factor; disease model; plague; 

KW whooping cough.
XX 

OS Photorhabdus luminescens. 

PN WO200294867-A2. 
XX 

PD 28-NOV-2002. 
XX 

PF 07-FEB-2002; 2002WO-IB003040 . 
XX 

PR 07-FEB-2001; 2001FR-00001659. 



PI Glaser P, Frangeul L, Kunst F, Danchin A;
XX

DR WPI; 2003-148459/14.
XX

PT Genomic sequence of Photorhabdus luminescens and encoded polypeptides
PT useful e.g. as therapeutic antimicrobials and agricultural pesticides.
XX

PS Claim 2; SEQ ID NO 2226; 1205pp; French.
XX

CC The invention relates to the isolation of genes and their encoded
CC proteins from Photorhabdus luminescens. The isolated sequences are
CC sources of probes and primers for detecting the genome of P. luminescens
CC and related species; to study polymorphisms; for gene analysis and for
CC detection/amplification of the genes. Antibodies (Ab) raised against the
CC polypeptides encoded by the genes are used for detection/identification
CC of P. luminescens, e.g. in foods. The genes, proteins, Ab and cells that
CC a gene-containing vector are used to select compounds that
CC inhibit expression of the genes in plants,
CC animals or microorganisms other than P. luminescens and are able to alter

CC response or sensitivity to toxins and antibiotics produced by P.

CC luminescens. Cells transformed to express the genes are useful for 

CC recombinant production of the proteins, particularly toxins and 

CC antibacterials useful as insecticides, bactericides and fungicides. The 

CC genes, proteins, vectors containing the genes and Ab are also useful 

CC therapeutically (to treat microbial infection by bacteria or fungi that 

CC are sensitive to P. luminescens-encoded toxins or antibiotics) and as 

CC biopesticides. Other uses of the genes and the proteins are as virulence 

CC factors and for identifying targets of human diseases for which P. 

CC luminescens is a model (particularly plague and whooping cough). This

CC sequence represents one of the isolated P. luminescens proteins 

SQ Sequence 558 AA; 

Query Match 34.4%; Score 1010.5; DB 6; Length 558; 

Best Local Similarity 36.3%; Pred. No. 4.5e-81; 

Matches 205; Conservative 113; Mismatches 211; Indels 35; Gaps 6; 



Qy 6 LRFLSFLLCISQSVELQAA--------PSVPTFLTENGLTYCTHASGFSFNPQTADAGTS 57

Db 4 MRSLIYWIILSLSAPAIAETITTPEKNPHVPTDIQQQGFIYCVNGNLNTFNPQLASSGLT 63

Qy 58 MNVVTEQIYNKLFDIKNHSATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRD 117

Db 64 VDTLAAQLYERLLDVDPYTYRLLPELASHWEILDNGATYRFYLRHNVSFQSTDWFTPTRN 123

Qy 118 FNAEDVVFSINRVLGHNTYLPTLAEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIK 177

Db 124 MNADDVIFSFKRLFDKQHY-------------------YHNVNGGHYPYFDSLQLADSIQ 164

Qy 178 SVTALSPYQVKIELFAPDSSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPY 237

Db 165 SIRKLNEYTVEFRLNEPDASFLWHLATHYAPILSQEYGQQLHQMNRHEQIDWKPVGTGPF 224

Qy 238 QVKDYVYNQYVRLVRNENYWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQI 297

Db 225 MLEDHRTRQFIRLVRHDKYWKGKPQMRQIVIDVGAGGTGRMSKLLTGECDVLAYPAASQL 284

Qy 298 GLLKNDDKHYYMQSTDGMNLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTAT 357

Db 285 TVLR-DDPRLRLTLRPGMNIAYLAFNTSKPPLDKLQVRQAIAYAINNQRLMQSIYYGTAE 343

Qy 358 VANNIIPEVSWASTVNTPEFEFDYHPKIAK---NKLADKNLLLNLWVINEEQVYNPAPFK 414

Db 344 TASSILPRASWAYDNQTEITE--YNPEKSRKILNDLGLNGLQLSLWVPSASQSYNPSPLK 401

Qy 415 MAEMIKWDLAQAGVKVKVRAVTRPFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILS 474

Db 402 MAELIQADLAQVGITMSIKPVEGRFQETKLMDKS--HDMTLSGWSTDSNDPDSFFRPLLS 459

Qy 475 CGTKNELTNLSNWCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRI 534

Db 460 CAAIASQTNFSHWCEPTFDKILREALINQQLLSRIKYYHAAQQVLEQQLPILPLAYSLHL 519

Qy 535 LVANSRVKGVKMTPFGSLDFSTLY 558

Db 520 QAYRHDIKGLVLSPFGNTSFAGVY 543





RESULT 6 
ABU30402 

ID ABU30402 standard; protein; 549 AA.
XX 

AC ABU30402; 
XX 

DT 19-JUN-2003 (first entry) 
XX 

DE Protein encoded by Prokaryotic essential gene #15929. 
XX 

KW Antisense; prokaryotic essential gene; cell proliferation; drug design. 

OS Haemophilus influenzae. 

PN WO200277183-A2. 
PD 03-OCT-2002.

PF 21-MAR-2002; 2002WO-US009107 . 
XX 

PR 21-MAR-2001; 2001US-00815242 . 

PR 06-SEP-2001; 2001US-00948993 . 

PR 25-OCT-2001; 2001US-0342923P. 

PR 08-FEB-2002; 2002US-00072851.

PR 06-MAR-2002; 2002US-0362699P. 
XX 

PA (ELIT-) ELITRA PHARM INC. 



PT New antisense nucleic acids, useful for identifying proteins or screening 

PT for homologous nucleic acids required for cellular proliferation to 

PT isolate candidate molecules for rational drug discovery programs. 

PS Claim 25; SEQ ID NO 58326; 1766pp; English. 
XX 

CC The invention relates to an isolated nucleic acid comprising any one of 

CC the 6213 antisense sequences given in the specification where expression 

CC of the nucleic acid inhibits proliferation of a cell. Also included are: 
CC (1) a vector comprising a promoter operably linked to the nucleic acid

CC encoding a polypeptide whose expression is inhibited by the antisense 

CC nucleic acid; (2) a host cell containing the vector; (3) an isolated 

CC polypeptide or its fragment whose expression is inhibited by the 

CC antisense nucleic acid; (4) an antibody capable of specifically binding 

CC the polypeptide; (5) producing the polypeptide; (6) inhibiting cellular 

CC proliferation or the activity of a gene in an operon required for 

CC proliferation; (7) identifying a compound that influences the activity of 

CC the gene product or that has an activity against a biological pathway 

CC required for proliferation, or that inhibits cellular proliferation; (8) 

CC identifying a gene required for cellular proliferation or the biological 

CC pathway in which a proliferation-required gene or its gene product lies 

CC or a gene on which the test compound that inhibits proliferation of an 

CC organism acts; (9) manufacturing an antibiotic; (10) profiling a 

CC compound's activity; (11) a culture comprising strains in which the gene 

CC product is overexpressed or underexpressed; (12) determining the extent 

CC to which each of the strains is present in a culture or collection of 

CC strains; or (13) identifying the target of a compound that inhibits the 

CC proliferation of an organism. The antisense nucleic acids are useful for 

CC identifying proteins or screening for homologous nucleic acids required 

CC for cellular proliferation to isolate candidate molecules for rational 

CC drug discovery programs, or for screening homologous nucleic acids 

CC required for proliferation in cells other than S. aureus, S. typhimurium, 

CC K. pneumoniae or P. aeruginosa. The present sequence is encoded by one of 

CC the target prokaryotic essential genes. Note: The sequence data for this

CC patent did not form part of the printed specification, but was obtained

CC in electronic format directly from WIPO at

CC ftp.wipo.int/pub/published_pct_sequences
XX

SQ Sequence 549 AA;

Query Match 26.0%; Score 764.5; DB 6; Length 549; 

Best Local Similarity 32.7%; Pred. No. 4.7e-59; 

Matches 178; Conservative 100; Mismatches 227; Indels 39; Gaps 10; 
Qy 16 SQSVELQAAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMNVVTEQIYNKLFDIKNH 75

Db 24 SSSANKSTAQTEAKSSSNNTFVYCTAKAPLGFSPALIIEGTLNASSQQVYNRLVEFKKG 83

Qy 76 SATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNAEDVVFSINRVLGHNT 135

Db 84 STDIEPALAESWEISDDGLSYTFHLRKGVKFHTTKEFTPTRDFNADDVVFSFQRQLDPN- 142

Qy 136 YLPTLAEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSVTALSPYQVKIELFAPD 195

Db 143 HPYHNV------------------SKGTYPYFKAMKFPELLKSVEKVDDNTIRITLNKTD 184

Qy 196 SSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQVKDYVYNQYVRLVRNEN 255

Db 185

Qy 256 YWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGLLKNDDKHYYMQSTDGM 315

Db 245 YWKGRTPLDRLVISIVPDATTRYAKLQAGTCDLILFPNVADLAKMKTDPKVQLLEQ-KGL 303

Qy 316 NLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVANNIIPEVSWASTVNTP 375

Db 304 NVAYIAFNTEKAPFDNVKVRQALNYAVDKKAIIEAVYQGAGTSAKNPLPPTIW--SYNDE 361

Qy 376 EFEFDYHPKIAKNKLAD----KNLLLNLWVINEEQVYNPAPFKMAEMIKWDLAQAGVKVK 431

Db 362

Qy 432 VRAVTRP--FLTAQLRNQSENYDLI--LSGWLAGNLDPDGFMRPILSCGTKN-ELTNLSN 486

Db 420 ----TNPVTYEWADYRKRAKEGELTAGIFGWSGDNGDPDNFLSPLL--GSSNIGNSNMAR 473

Qy 487 WCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRILVANSRVKGVKM 546

Db 474 FNNSEFDALLNEAIGLTNKEERAKLYKQAQVIVHNQAPWIPVAHSVGFAPLSPRVKGYVQ 533

Qy 547 TPFG 550
       :|||
Db 534 SPFG 537





RESULT 7 
AAO17804 

ID AAO17804 standard; protein; 549 AA.
XX

AC AAO17804; 

DT 05-AUG-2002 (first entry) 
XX 

DE H influenzae BVH-NTHI3 protein SEQ ID t> 



KW infection; BVH-NTHI1; otitis media; BVH-NTHI2;
KW pneumonia; meningitis; bacteraemia; BVH-NTHI3;
KW BVH-NTHI6; BVH-NTHI7; BVH-NTHI8; BVH-NTHI9;
KW antiinflammatory; auditory;
XX



OS Haemophilus influenzae.
XX

PN WO200228889-A2.
XX

PD 11-APR-2002.
XX

PF 02-OCT-2001; 2001WO-CA001402.
XX

PR 02-OCT-2000; 2000US-0236712P.
XX

PA (SHIR-) SHIRE BIOCHEM INC.
XX

PI Hamel J, Couture F, Brodeur BR, Martin D, Ouellet C, Tremblay M;
PI Charbonneau A, Vayssier C;
XX

PT Novel isolated Haemophilus influenzae polypeptide;
PT for inducing protective immune responses against f
PT and for treating otitis media, sinusitis, bronchit
XX

PS Claim 17; Fig 6; 58pp; English.
XX

CC The present invention provides the protein and coding sequences of
CC Haemophilus influenzae BVH-NTHI1-12. The sequences can be used in the
CC production of a vaccine to protect against, and in the diagnosis of, H
CC influenzae infection, which can lead to otitis media, sinusitis,
CC bronchitis, pneumonia, meningitis and bacteraemia. The present sequence
CC is a protein of the invention
XX

SQ Sequence 549 AA;

Query Match 26.0%; Score 763.5; DB 5; Length 549;
Best Local Similarity 32.7%; Pred. No. 5.8e-59; 

Matches 178; Conservative 100; Mismatches 227; Indels 39; Gaps 10; 

Qy 16 SQSVELQAAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMNVVTEQIYNKLFDIKNH 75

Db 24 SSSANKSTAQTEAKSSSNNTFVYCTAKAPLGFSPALIIEGTSYNASSQQVYNRLVEFKKG 83

Qy 76 SATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNAEDVVFSINRVLGHNT 135

Db 84 STDIEPALAESWEISDDGLSYTFHLRKGVKFHTTKEFTPTRDFNADDVVFSFQRQLDPN- 142

Qy 136 YLPTLAEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSVTALSPYQVKIELFAPD 195

Db 143 HPYHNV------------------SKGTYPYFKAMKFPELLKSVEKVDDNTIRITLNKTD 184

Qy 196 SSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQVKDYVYNQYVRLVRNEN 255

Db 185 ATFLASLGMDFISIYSAEYADSMLKAGKPETLDSRPVGTGPFVFVDYKTDQAIQYVAHEN 244

Qy 256 YWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGLLKNDDKHYYMQSTDGM 315

Db 245 YWKGRTPLDRLVISIVPDATTRYAKLQAGTCDLILFPNVADLAKMKTDPKVQLLEQ-KGL 303

Qy 316 NLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVANNIIPEVSWASTVNTP 375

Db 304 NVAYIAFNTEKAPFDNVKVRQALNYAVDKKAIIEAVYQGAGTSAKNPLPPTIW--SYNDE 361

Qy 376 EFEFDYHPKIAKNKLAD----KNLLLNLWVINEEQVYNPAPFKMAEMIKWDLAQAGVKVK 431

Db 362 IQDYPYDPEKAKQLLAEAGYPNGFETDFWIQPIIRASNPNPKRMAELIMADWAKIGVK-- 419

Qy 432 VRAVTRP--FLTAQLRNQSENYDLI--LSGWLAGNLDPDGFMRPILSCGTKN-ELTNLSN 486

Db 420 ----TNPVTYEWADYRKRAKEGELTAGIFGWSGDNGDPDNFLSPLL--GSSNIGNSNMAR 473

Qy 487 WCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRILVANSRVKGVKM 546

Db 474 FNNSEFDALLNEAIGLTNKEERAKLYKQAQVIVHNQAPWIPVAHSVGFAPLSPRVKGYVQ 533

Qy 547 TPFG 550
       :|||
Db 534 SPFG 537

RESULT 8 
ABU19675 

ID ABU19675 standard; protein; 542 AA. 
XX 

AC ABU19675; 
XX 

DT 19-JUN-2003 (first entry) 
XX 

DE Protein encoded by Prokaryotic essential gene #5202. 
XX 

KW Antisense; prokaryotic essential gene; cell proliferation; drug design. 
XX 

OS Borrelia cepacia. 
XX 

PN WO200277183-A2. 

PD 03-OCT-2002. 
XX 

PF 21-MAR-2002; 2002WO-US009107 . 
XX 

PR 21-MAR-2001; 2001US-00815242 . 

PR 06-SEP-2001; 2001US-00948993 . 

PR 25-OCT-2001; 2001US-0342923P. 

PR 08-FEB-2002; 2002US-00072851 . 

PR 06-MAR-2002; 2002US-0362699P. 

PA (ELIT-) ELITRA PHARM INC. 
DR WPI; 2003-029926/02.

DR N-PSDB; ACA23545. 
XX 

PT New antisense nucleic acids, useful for identifying proteins or screening 

PT for homologous nucleic acids required for cellular proliferation to 

PT isolate candidate molecules for rational drug discovery programs. 

PS Claim 25; SEQ ID NO 47599; 1766pp; English. 

XX 

CC The invention relates to an isolated nucleic acid comprising any one of 

CC the 6213 antisense sequences given in the specification where expression 

CC of the nucleic acid inhibits proliferation of a cell. Also included are: 

CC (1) a vector comprising a promoter operably linked to the nucleic acid 

CC encoding a polypeptide whose expression is inhibited by the antisense 

CC nucleic acid; (2) a host cell containing the vector; (3) an isolated 

CC polypeptide or its fragment whose expression is inhibited by the 

CC antisense nucleic acid; (4) an antibody capable of specifically binding 

CC the polypeptide; (5) producing the polypeptide; (6) inhibiting cellular 

CC proliferation or the activity of a gene in an operon required for 

CC proliferation; (7) identifying a compound that influences the activity of 

CC the gene product or that has an activity against a biological pathway 

CC required for proliferation, or that inhibits cellular proliferation; (8) 

CC identifying a gene required for cellular proliferation or the biological 

CC pathway in which a proliferation-required gene or its gene product lies 

CC or a gene on which the test compound that inhibits proliferation of an 

CC organism acts; (9) manufacturing an antibiotic; (10) profiling a 

CC compound's activity; (11) a culture comprising strains in which the gene 

CC product is overexpressed or underexpressed; (12) determining the extent 

CC to which each of the strains is present in a culture or collection of 

CC strains; or (13) identifying the target of a compound that inhibits the 

CC proliferation of an organism. The antisense nucleic acids are useful for 

CC identifying proteins or screening for homologous nucleic acids required 

CC for cellular proliferation to isolate candidate molecules for rational 

CC drug discovery programs, or for screening homologous nucleic acids 

CC required for proliferation in cells other than S. aureus, S. typhimurium, 

CC K. pneumoniae or P. aeruginosa. The present sequence is encoded by one of 

CC the target prokaryotic essential genes. Note: The sequence data for this 

CC patent did not form part of the printed specification, but was obtained 

CC in electronic format directly from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences
XX 

SQ Sequence 542 AA; 

Query Match 25.8%; Score 757; DB 6; Length 542; 

Best Local Similarity 31.9%; Pred. No. 2.2e-58; 

Matches 175; Conservative 88; Mismatches 242; Indels 44; Gaps 7; 

Qy 18 SVELQAAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMNVVTEQIYNKLFDIKNHSA 77

Db 20 AASLGVAGSAFAQIPNKTLVYCSEGSPAGFDSAQFTTGVDFTAATFTVYNRLVEFERGGT 79

Qy 78 TLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNAEDVVFSINRVLGHNTYL 137

Db 80 KVEPGLAEKWDVSSDGKVYTFHLRHGVKFHTTDFFKPTREFNADDVVFSFQRMLDPNNAF 139

Qy 138 PTLAEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSVTALSPYQVKIELFAPDSS 197

Db 140 RKAYPVSFPYFTDMGLDKLITKVEKVDPYTVKFTLAEPNAP 180

Qy 198 ILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQVKDYVYNQYVRLVRNENYW 257

Db 181 FIQNMAMEFASILSAEYGDQLMKAGRAADINQKPVGTGPFIFRSYTKDATIRFDGNPDYW 240

Qy 258 KK-EAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGLLKNDDKHYYMQSTDGMN 316

Db 241 KKGEVKLSKLIFSITPDPGVRVQKIKRNECQVMSYPRPADIATLK-ADSNVDMPSQAGFN 299

Qy 317 LAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVANNIIPEVSWASTVNTPE 376

Db 300 LGYLAYNVEHKPVDKLEVRQALDMAINKKAILESVYQGAGQAASAPMPPTQWS 352

Qy 377 FEFDYHPKIAKNKLADKNLLL-----------NLWVINEEQVYNPAPFKMAEMIKWDLAQ 425

Db 353 --YDKNLKMAAYDTAKAKALLAKAGFPNGFEITLWAMPVQRAYNPNARLMAEMIQADWAK 410

Qy 426 AGVKVKVRAVTRPFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILSCGTKNELTNLS 485

Db 411 IGVKAKI--VTYEWGEYIKRAHAGEQDTMLIGWTGDNGDPDNWLGTLLGCEAIKG-NNFS 467

Qy 486 NWCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRILVANSRVKGVK 545

Db 468 HWCYKPFDELVQKGRTTTGQDARTKLYTQAQQIFAQQLPFSPIANSTVYQPVRKNVVDMR 527

Qy 546 MTPFGSLDF 554

Db 528 IEPLGYARF 536

RESULT 9 
ADT05677 

ID ADT05677 standard; protein; 540 AA. 
XX 

AC ADT05677; 

XX 

DT 02-DEC-2004 (first entry) 
XX 

DE Haemophilus influenzae (NTHi) protein - SEQ ID 713. 

XX 

KW middle ear bacterial infection; nasopharynx bacterial infection. 
XX 

OS Haemophilus influenzae. 
XX 

PN WO2004078949-A2. 

XX 

PD 16-SEP-2004. 
XX 

PF 05-MAR-2004; 2004WO-US007001. 
XX 

PR 06-MAR-2003; 2003US-0453134P. 
XX 

PA (CHIL-) CHILDRENS HOSPITAL INC. 
XX 

PI Bakaletz LO, Munson RS, Dyer DW; 
XX 

DR WPI; 2004-662422/64. 

DR N-PSDB; ADT05676. 

PT New polynucleotides of nontypeable strain of Haemophilus influenzae, 

PT useful for treating or preventing NTHi bacterial infections of the middle 

PT ear and/or nasopharynx. 

PS Claim 3; SEQ ID NO 713; 88pp; English. 

CC The invention comprises nucleotide sequences (genes) from the genome of a 

CC nontypeable strain of Haemophilus influenzae (NTHi). The NTHi DNA 

CC sequences of the invention are useful for treating or preventing NTHi 

CC bacterial infections of the middle ear and/or nasopharynx. The present 

CC amino acid sequence represents an NTHi protein of the invention. 

XX 

SQ Sequence 540 AA; 

Query Match 25.7%; Score 756; DB 8; Length 540; 
Best Local Similarity 32.4%; Pred. No. 2.6e-58; 

Matches 180; Conservative 103; Mismatches 232; Indels 40; Gaps 11; 

Qy 6 LRFLSFLLCIS-QSVELQAAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMNVVTEQ 64 

I : I : I I : I I : : I III : | : | | | | | : : | 

Db 4 LQLLFWQLVINLASANKSTAQTEAKSSSNNTFVYCTAKAPLGFSPALIIEGTSYNASSQQ 63 

Qy 65 IYNKLFDIKNHSATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNAEDVV 124 

= 11:1 : I I : I ll:|: II II : I I I I I I I I III :||| 

Db 64 VYNRLVEFKKGSTDIEPALAESWEISDDGLSYTFHLRKGVKFHTTKEFTPTRDFNADDVV 123 

Qy 125 FSINRVLGHNTYLPTLAEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSVTALSP 184 

Db 124 FSFQRQLDPN HPYHNV SKGTYPYFKAMKFPELLKSVEKVDD 164 

Qy 185 YQVKIELFAPDSSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQVKDYVY 244 

:; l I I : : I : I : I : I III : I I : I I I I I I : II 

Db 165 NTIRITLNKTDATFLASLGMDFISIYSAEYADSMLKAGKPETLDSRPVGTGPFVFVDYKT 224 

Qy 245 NQYVRLVRNENYWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGLLKNDD 304 
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: 1 : : • : I I I I I :::::: | : | | | . ., . ., , 

Db 225 DQAIQYVAHENYWKGRTPLDRLVISIVPDATTRYAKLQAGTCDLILFPNVADLAKMKTDP 284 
Qy 305 KHYYMQSTDGMNLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVANNIIP 364 
Db 285 KVQLLEQ-KGLNVAYIAFNTEKAPFDNVKVRQALNYAVDKKAIIEAVYQGAGISAKNPLP 343 
Qy 365 EVSWASTVNTPEFEFDYHPKIAKNKLAD----KNLLLNLWVINEEQVYNPAPFKMAEMIK 420 
Db 344 PTIW--SYNDEIQDYPYDPEKAKQLLAEAGYPNGFETDFWIQPIIPASNPNPKRMAELIM 401 
Qy 421 WDLAQAGVKVKVRAVTRP--FLTAQLRNQSENYDLI--LSGWLAGNLDPDGFMRPILSCG 476 

Db 402 ADWAKIGVK TNPVTYEWADYRKRAKEGELTAGIFGWSGDNGDPDN FLSPLL--G 453 

Qy 477 TKN-ELTNLSNWCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRIL 535 
Db 454 SSNIGNSNMARFNNSEFDALLNEAIGLTNKEERAKLYKQAQVIVHNQAPWIPVAHSVGFA 513 
Qy 536 VANSRVKGVKMTPFG 550 
Db 514 PLSPRVKGYVQSPFG 528 



RESULT 10 
ABU21963 

ID ABU21963 standard; protein; 542 AA. 
XX 

AC ABU21963; 
XX 

DT 19-JUN-2003 (first entry) 




This page gives you Search Results detail for the Application 10807746 and Search Result 

20061218_115220_us-10-807-746-7.rup. 




GenCore version 5.1.9 
Copyright (c) 1993 - 2006 Biocceleration Ltd. 



protein search, using sw model 

December 18, 2006, 19:54:23 ; Search time 304 Seconds 
(without alignments) 
1716.147 Million cell updates/sec 



Title: US-10-807-746-7 
Perfect score: 2937 
Sequence: 1 MLRLNLRFLSFLLCISQSVE ... KMTPFGSLDFSTLYFIQEKH 564 

Scoring table: 



Searched: 2849598 seqs, 925015592 residues 

Total number of hits satisfying chosen parameters: 

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



Result           Query
 No.  Score    Match %  Len  DB  ID             Description

  1  2937     100.0  564  2  Q4QL73_HAEI8   Q4ql73 haemophilus
  2  2895.5    98.6  565  1  SAPA_HAEIN     P45285 haemophilus
  3  2219      75.6  540  2  Q714U3_HAEIN   Q714u3 haemophilus
  4  1965      66.9  563  2  Q9CMC1_PASMU   Q9cmc1
  5  1892.5    64.4  567  2  Q65U97_MANSM   Q65u97
  6  1752.5    59.7  561  2  Q3EG24_ACTSC   Q3eg24
  7  1267.5    43.2  560  2  Q7VM01_HAEDU   Q7vm01 haemophilus
  8  1030.5    35.1  547  2  Q32FZ3_SHIDS   Q32fz3 shigella dy
  9  1027.5    35.0  547  2  Q83RL7_SHIFL   Q83rl7 shigella fl
 10  1025.5    34.9  547  2  Q8CW41_ECOL6   Q8cw41 escherichia
 11  1023.5    34.8  547  1  SAPA_ECOLI     Q47622 escherichia
 12  1023.5    34.8  547  2  Q7UCQ5_SHIFL   Q7ucq5 shigella fl
 13  1022.5    34.8  547  2  Q31ZZ0_SHIBS   Q31zz0 shigella bo
 14  1022.5    34.8  547  2  Q3Z142_SHISS   Q3z142 shigella so
 15  1022.5    34.8  549  1  SAPA_SALTY     P36634 salmonella
 16  1022.5    34.8  549  2  Q5PCZ2_SALPA   Q5pcz2 salmonella
 17  1022.5    34.8  557  2  Q57NX0_SALCH   Q57nx0 salmonella
 18  1019.5    34.7  549  2  Q8Z7B5_SALTI   Q8z7b5 salmonella
 19  1018.5    34.7  547  2  Q8X7F3_ECO57   Q8x7f3 escherichia
 20  1017.5    34.6  547  2  Q3MSG2_KLEOX   Q3msg2 klebsiella
 21  1017.5    34.6  547  2  Q3MSI5_KLEPN   Q3msi5 klebsiella
 22  1012      34.5  548  2  Q8ZE31_YERPE   Q8ze31 yersinia pe
 23  1010.5    34.4  554  2  Q7N3X5_PHOLL   Q7n3x5 photorhabdu
 24  1007.5    34.3  547  2  Q66A60_YERPS   Q66a60 yersinia ps
 25  997.5     34.0  562  2  Q6D5R3_ERWCT   Q6d5r3
 26  968       33.0  539  2  Q6LPF2_PHOPR   Q6lpf2 photobacter
 27  964       32.8  540  2  O86187_ERWCH   O86187 erwinia chr
 28  940.5     32.0  595  2  Q2NSU4_SODGL   Q2nsu4 sodalis glo
 29  928       31.6  541  2  Q35UU8_9GAMM   Q35uu8 shewanella
 30  915       31.2  541  2  Q2Z478_9GAMM   Q2z478 shewanella
 31  914.5     31.1  535  2  Q3IF64_PSEHT   Q3if64 pseudoalter
 32  910       31.0  541  2  Q366E0_9GAMM   Q366e0 shewanella
 33  902.5     30.7  541  2  Q3NQ27_SHEFR   Q3nq27 shewanella
 34  901.5     30.7  541  2  Q2WY77_9GAMM   Q2wy77 shewanella
 35  899.5     30.6  541  2  Q2ZMV8_SHEPU   Q2zmv8 shewanella
 36  898       30.6  542  2  Q3QDN5_9GAMM   Q3qdn5 shewanella
 37  891       30.3  556  2  Q47XM8_COLP3   Q47xm8 colwellia p
 38  886.5     30.2  539  2  Q87QH8_VIBPA   Q87qh8 vibrio para
 39                          Q3Q323_9GAMM
 40  882.5     30.0  525  2  Q8EG09_SHEON   Q8eg09 shewanella
 41  879       29.9  547  2  Q33QG6_9GAMM   Q33qg6 shewanella
 42  877.5     29.9  541  2  Q5E0R7_VIBF1   Q5e0r7 vibrio fisc
 43  871.5     29.7  541  2  Q8KUE4_VIBFI   Q8kue4 vibrio fisc
 44  856       29.1  540  2  Q9KRG2_VIBCH   Q9krg2 vibrio chol
 45  854.5     29.1  541  2  Q5QUD5_IDILO   Q5qud5 idiomarina



ALIGNMENTS 



RESULT 1 
Q4QL73_HAEI8 

ID Q4QL73_HAEI8 PRELIMINARY; PRT; 564 AA. 

AC Q4QL73; 

DT 19-JUL-2005, integrated into UniProtKB/TrEMBL. 

DT 19-JUL-2005, sequence version 1. 

DT 07-FEB-2006, entry version 5. 

DE ABC-type transport system, periplasmic component, involved in 
DE antimicrobial peptide resistance. 
GN Name=sapA; OrderedLocusNames=NTHI1401; 
OS Haemophilus influenzae (strain 86-028NP). 
OC Bacteria; Proteobacteria; Gammaproteobacteria; Pasteurellales; 
OC Pasteurellaceae; Haemophilus. 
OX NCBI_TaxID=281310; 
RN [1] 
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]. 
RX PubMed=15968074; DOI=10.1128/JB.187.13.4627-4636.2005; 
RA Harrison A., Dyer D.W., Gillaspy A., Ray W.C., Mungur R., Carson M.B., 
RA Zhong H., Gipson J., Gipson M., Johnson L.S., Lewis L., Bakaletz L.O., 
RA Munson R.S. Jr.; 
RT "Genomic sequence of an otitis media isolate of nontypeable 
RT Haemophilus influenzae: comparative study with H. influenzae serotype 
RT d, strain KW20."; 
RL J. Bacteriol. 187:4627-4636(2005). 
CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 
DR EMBL; CP000057; AAX88224.1; -; Genomic_DNA. 
DR GO; GO:0005215; F:transporter activity; IEA. 
DR GO; GO:0006810; P:transport; IEA. 
DR InterPro; IPR000914; SBP_bac_5. 
DR Pfam; PF00496; SBP_bac_5; 1. 
DR PROSITE; PS01040; SBP_BACTERIAL_5; 1. 
KW Complete proteome. 
SQ SEQUENCE 564 AA; 64420 MW; 0EB25C1FFA952643 CRC64; 

Query Match 100.0%; Score 2937; DB 2; Length 564; 
Best Local Similarity 100.0%; Pred. No. 1.3e-179; 
Matches 564; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 MLRLNLRFLSFLLCISQSVELQAAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMNV 60 
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MLRLNLRFLSFLLCISQSVELQAAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMNV 60 

Qy 61 VTEQIYNKLFDIKNHSATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNA 120 
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 VTEQIYNKLFDIKNHSATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNA 120 

Qy 121 EDVVFSINRVLGHNTYLPTLAEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSVT 180 
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 EDVVFSINRVLGHNTYLPTLAEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSVT 180 

Qy 181 ALSPYQVKIELFAPDSSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQVK 240 
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 ALSPYQVKIELFAPDSSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQVK 240 

Qy 241 DYVYNQYVRLVRNENYWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGLL 300 
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 DYVYNQYVRLVRNENYWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGLL 300 

Qy 301 KNDDKHYYMQSTDGMNLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVAN 360 
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 KNDDKHYYMQSTDGMNLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVAN 360 

Qy 361 NIIPEVSWASTVNTPEFEFDYHPKIAKNKLADKNLLLNLWVINEEQVYNPAPFKMAEMIK 420 
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 NIIPEVSWASTVNTPEFEFDYHPKIAKNKLADKNLLLNLWVINEEQVYNPAPFKMAEMIK 420 

Qy 421 WDLAQAGVKVKVRAVTRPFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILSCGTKNE 480 
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 WDLAQAGVKVKVRAVTRPFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILSCGTKNE 480 

Qy 481 LTNLSNWCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRILVANSR 540 
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 LTNLSNWCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRILVANSR 540 

Qy 541 VKGVKMTPFGSLDFSTLYFIQEKH 564 
       ||||||||||||||||||||||||
Db 541 VKGVKMTPFGSLDFSTLYFIQEKH 564 

RESULT 2 
SAPA_HAEIN 

ID SAPA_HAEIN STANDARD; PRT; 565 AA. 

AC P45285; 

DT 01-NOV-1995, integrated into UniProtKB/Swiss-Prot. 
DT 01-NOV-1995, sequence version 1. 
DT 07-MAR-2006, entry version 37. 
DE Peptide transport periplasmic protein sapA precursor. 
GN Name=sapA; OrderedLocusNames=HI1638; 
OS Haemophilus influenzae. 
OC Bacteria; Proteobacteria; Gammaproteobacteria; Pasteurellales; 
OC Pasteurellaceae; Haemophilus. 
OX NCBI_TaxID=727; 
RN [1] 
RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]. 
RC STRAIN=Rd / KW20 / ATCC 51907; 
RX MEDLINE=95350630; PubMed=7542800; 
RA Fleischmann R.D., Adams M.D., White O., Clayton R.A., Kirkness E.F., 
RA Kerlavage A.R., Bult C.J., Tomb J.-F., Dougherty B.A., Merrick J.M., 
RA McKenney K., Sutton G.G., FitzHugh W., Fields C.A., Gocayne J.D., 
RA Scott J.D., Shirley R., Liu L.-I., Glodek A., Kelley J.M., 
RA Weidman J.F., Phillips C.A., Spriggs T., Hedblom E., Cotton M.D., 
RA Utterback T.R., Hanna M.C., Nguyen D.T., Saudek D.M., Brandon R.C., 
RA Fine L.D., Fritchman J.L., Fuhrmann J.L., Geoghagen N.S.M., 
RA Gnehm C.L., McDonald L.A., Small K.V., Fraser C.M., Smith H.O., 
RA Venter J.C.; 
RT "Whole-genome random sequencing and assembly of Haemophilus influenzae 
RT Rd."; 
RL Science 269:496-512(1995). 
CC -!- FUNCTION: Involved in a peptide intake transport system that plays 
CC a role in the resistance to antimicrobial peptides (By 
CC similarity). 
CC -!- SUBCELLULAR LOCATION: Periplasmic (Probable). 
CC -!- SIMILARITY: Belongs to the bacterial solute-binding protein 5 
CC family. 
CC 
CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 
CC Distributed under the Creative Commons Attribution-NoDerivs License 
CC 
DR EMBL; L42023; AAC23285.1; -; Genomic_DNA. 
DR PIR; A64134; A64134. 
DR HSSP; P23847; 1DPE. 
DR GenomeReviews; L42023_GR; HI1638. 
DR TIGR; HI1638; -. 
DR BioCyc; HINF71421:HI1638-MONOMER; -. 
DR InterPro; IPR000914; SBP_bac_5. 
DR Pfam; PF00496; SBP_bac_5; 1. 
DR PROSITE; PS01040; SBP_BACTERIAL_5; 1. 
KW Complete proteome; Peptide transport; Periplasmic; Protein transport; 
KW Signal; Transport. 
FT SIGNAL 1 23 Potential. 
FT CHAIN 24 565 Peptide transport periplasmic protein 
FT sapA. 
FT /FTId=PRO_0000031803. 
SQ SEQUENCE 565 AA; 64504 MW; 449E454F1278C2A7 CRC64; 

Query Match 98.6%; Score 2895.5; DB 1; Length 565; 
Best Local Similarity 98.8%; Pred. No. 5.9e-177; 
Matches 558; Conservative 4; Mismatches 2; Indels 1; Gaps 1; 



Qy 1 MLRLNLRFLSFLLCISQSVELQAAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMNV 60 
Db 1 MLRLNLRFLSFLLCIIQSVELQAAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMNV 60 

Qy 61 VTEQIYNKLFDIKNHSATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNA 120 
Db 61 VTEQIYNKLFDIKNHSATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNA 120 

Qy 121 EDVVFSINRVL-GHNTYLPTLAEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSV 179 
Db 121 EDVVFSINRVLGGHNTYLPTLAETNVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSV 180 

Qy 180 TALSPYQVKIELFAPDSSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQV 239 
Db 181 TALSPYQVKIELFAPDSSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQV 240 

Qy 240 KDYVYNQYVRLVRNENYWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGL 299 
Db 241 KDYVYNQYVRLVRNENYWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGL 300 

Qy 300 LKNDDKHYYMQSTDGMNLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVA 359 
Db 301 LKNDDKHYYMQSTDGMNLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVA 360 

Qy 360 NNIIPEVSWASTVNTPEFEFDYHPKIAKNKLADKNLLLNLWVINEEQVYNPAPFKMAEMI 419 
Db 361 NNIIPEVSWASSVNTPEFEFDYNPKIAKNKLADKNLLLNLWVINEEQVYNPAPFKIAEMI 420 

Qy 420 KWDLAQAGVKVKVRAVTRPFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILSCGTKN 479 
Db 421 KWDLAQAGVKVKVRAVTRPFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILSCGTKN 480 

Qy 480 ELTNLSNWCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRILVANS 539 
Db 481 ELTNLSNWCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRILVANS 540 

Qy 540 RVKGVKMTPFGSLDFSTLYFIQEKH 564 
Db 541 RVKGVKMTPFGSLDFSTLYFIQEKY 565 
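
Each alignment in this listing is introduced by three statistics lines made of `name value;` pairs, so they can be read programmatically. The Python sketch below parses the lines printed for RESULT 2. It assumes the layout seen in this report. The regular expression and the name `parse_stats` are illustrative and are not part of GenCore.

```python
import re

# A field is a label (letters, dots, spaces), whitespace, a number with an
# optional exponent, an optional percent sign, and a terminating semicolon.
STAT_PATTERN = re.compile(
    r"([A-Za-z][A-Za-z. ]*?)\s+(-?\d+(?:\.\d+)?(?:e-?\d+)?)%?;"
)


def parse_stats(text):
    """Return a dict mapping each statistic label to its numeric value."""
    return {name.strip(): float(value) for name, value in STAT_PATTERN.findall(text)}


block = """Query Match 98.6%; Score 2895.5; DB 1; Length 565;
Best Local Similarity 98.8%; Pred. No. 5.9e-177;
Matches 558; Conservative 4; Mismatches 2; Indels 1; Gaps 1;"""

stats = parse_stats(block)
print(stats["Score"], stats["Pred. No."], stats["Matches"])
```

Values damaged by character recognition, such as a Pred. No. printed with a letter in place of a digit, will not match the pattern and would need to be corrected before parsing.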





RESULT 3 
Q714U3_HAEIN 

ID Q714U3_HAEIN PRELIMINARY; PRT; 540 AA. 
AC Q714U3; 

DT 05-JUL-2004, integrated into UniProtKB/TrEMBL. 

DT 05-JUL-2004, sequence version 1. 

DT 07-FEB-2006, entry version 8. 

DE SapA. 

GN Name=sapA; 
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OS Haemophilus influenzae. 

OC Bacteria; Proteobacteria; Gammaproteobacteria; Pasteurellales; 

OC Pasteurellaceae; Haemophilus. 

OX NCBI_TaxID=727; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=700222; 

RX MEDLINE=22999298; PubMed=14638817; 

RX DOI=10.1128/IAI.71.12.7202-7207.2003; 

RA Satola S.W., Schirmer P.L., Farley M.M.; 

RT "Genetic analysis of the capsule locus of Haemophilus influenzae 

RT serotype f."; 

RL Infect. Immun. 71:7202-7207(2003). 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 



DR EMBL; AF549211; AAQ12665.1; -; Genomic_DNA. 

DR GO; GO:0005215; F:transporter activity; IEA. 

DR GO; GO:0006810; P:transport; IEA. 

DR InterPro; IPR000914; SBP_bac_5. 

DR Pfam; PF00496; SBP_bac_5; 1. 

SQ SEQUENCE 540 AA; 61739 MW; 4BB46B7411611B48 CRC64; 

Query Match 75.6%; Score 2219; DB 2; Length 540; 

Best Local Similarity 76.8%; Pred. No. 1.2e-133; 

Matches 414; Conservative 62; Mismatches 61; Indels 2; Gaps 2; 



Qy 1 MLRLNLRFLSFLLCISQSVEL-QAAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMN 59 
Db 1 MLHRNVTF-CFLLCGLSLINLAQAAPRIPKMLTENGLTYCTNASGFSFNPQTADAGTSMN 59 

Qy 60 VVTEQIYNKLFDIKNHSATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFN 119 
Db 60 VVTEQIYNKLFDMKDHSAALVPVLAQSYSISSDGKQILINLRQGVKFHRTPWFTSTREFN 119 

Qy 120 AEDVVFSINRVLGHNTYLPTLAEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSV 179 
Db 120 AEDVVFSINRVLGHDTYLPTLSDDVVTYKNPQYRIFHEQAKKVHFPYFESIKLNQKIKSI 179 

Qy 180 TALSPYQVKIELFAPDSSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQV 239 
Db 180 TATNPYQVKIELFEPDASILSHLASQYSIIFSQEYAYQLSADDNLSQLDTHPVGTGPYQV 239 

Qy 240 KDYVYNQYVRLVRNENYWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGL 299 
Db 240 KDYVYNQYVRLIRNEEYWKKEAKIKNIIVDLSAERSGRLIKFFNNECQIALPEISQLGL 299 

Qy 300 LKNDDKHYYMQSTDGMNLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVA 359 
Db 300 LSEKNASYYLQSTEGMNLAYLAFNFQKSLMQDKTIRQAISQSLNRFRIVRNIYHNTATVA 359 

Qy 360 NNIIPEVSWASTVNTPEFEFDYHPKIAKNKLADKNLLLNLWVINEEQVYNPAPFKMAEMI 419 
Db 360 NNIIPDISWASAINTPDFTYDYQPSKAEKILRDKKLALKMWVINEEQVYNPAPIKMAELI 419 

Qy 420 KWDLAQAGVKVKVRAVTRPFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILSCGTKN 479 
Db 420 KWDLAKVGVDVKVRSVTRTFLTEQLRNHTEDYDLILTGWLAGNLDPDGFMRPILSCDTQN 479 

Qy 480 ELTNLSNWCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRILVAN 538 
Db 480 EITNLSNWCNPEFDKMMDRALSTNHLYERSKAYNSAQELILNELPIVPIANVQRLLVAS 538 



RESULT 4 
Q9CMC1_PASMU 

ID Q9CMC1_PASMU PRELIMINARY; PRT; 563 AA. 
AC Q9CMC1; 

DT 01-JUN-2001, integrated into UniProtKB/TrEMBL. 
DT 01-JUN-2001, sequence version 1. 
DT 07-FEB-2006, entry version 14. 
DE SapA. 

GN Name=sapA; OrderedLocusNames=PM0911; 
OS Pasteurella multocida. 

OC Bacteria; Proteobacteria; Gammaproteobacteria; Pasteurellales; 
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OC Pasteurellaceae; Pasteurella. 

OX NCBI_TaxID=747; 

RN [1] 

RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]. 

RC STRAIN=Pm70; 

RX MEDLINE=21145866; PubMed=11248100; DOI=10.1073/pnas.051634598; 

RA May B.J., Zhang Q., Li L.L., Paustian M.L., Whittam T.S., Kapur V.; 

RT "Complete genomic sequence of Pasteurella multocida Pm70."; 

RL Proc. Natl. Acad. Sci. U.S.A. 98:3460-3465(2001). 



CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 
CC Distributed under the Creative Commons Attribution-NoDerivs License 



DR EMBL; AE006130; AAK02995.1; -; Genomic_DNA. 

DR HSSP; P23847; 1DPE. 

DR BioCyc; PMUL747:PM0911-MONOMER; -. 

DR GO; GO:0005215; F:transporter activity; IEA. 

DR GO; GO:0006810; P:transport; IEA. 

DR InterPro; IPR000914; SBP_bac_5. 

DR Pfam; PF00496; SBP_bac_5; 1. 

KW Complete proteome. 

SQ SEQUENCE 563 AA; 64532 MW; 9F143828AC2C8306 CRC64; 

Query Match 66.9%; Score 1965; DB 2; Length 563; 

Best Local Similarity 65.7%; Pred. No. 2.4e-117; 

Matches 366; Conservative 89; Mismatches 100; Indels 2; Gaps 1; 



Qy 1 MLRLNLRFLSFLLCISQSVELQAAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMNV 60 

Db 1 MLIRKVIFACFLFLYSHFV--TAAPRVPNELTQNGLIYCTHATGFSFNPQTADAGTSMNV 58 

Qy 61 VTEQIYNKLFDIKNHSATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNA 120 

: 1 1 1 I I I I I I : : : I I I : I I I : I I : I : I | : | 

Db 59 ITEQIYNKLFETSDNSATVIPSLAESYRVSDNGTLITINLRKGVKFHHTEWFTPTRDFNA 118 

Qy 121 EDVVFSINRVLGHNTYLPTLAEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSVT 180 

Db 119 DDVVFSINRMLGYNSYLPTLDDESIHYSNPQYRIFHKQAKKIRFPYFESIKLNQKIKSIK 178 

Qy 181 ALSPYQVKIELFAPDSSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQVK 240 

I : : M I I : I : I I I : I I I II I I I ! I I I I I I M I I I I : I I I I I III I I I I I I I : I : 
Db 179 AITPYQVQIKLFQADASILSHLASQYAIIFSQEYALQLNADDNLVQLDLLPVGTGPYKVQ 238 

Qy 241 DYVYNQYVRLVRNENYWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGLL 300 

: I I I I I I : I I I : I I I I I : I :: I I : I I I I I I : I I I I I I I I I I I I I I I I || : I I I 
Db 239 NYFRNQYVRFIRNEHYWKKPAQIKNIIIDLSTDRTGRLVKFLNGECQIVSYPEVSQLGLL 298 

Qy 301 KNDDKHYYMQSTDGMNLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVAN 360 

:: - I : I : : I I I I : I I I I I I II I : : : | I I I | : : : | | | : : I M : || I I I I 
Db 299 QDKNEHFYVDFVEGMNLSYLAFNFKKPAMKSMKLRRAISQAIDRHRIVQTIYHHTATVAN 358 

Qy 361 NIIPEVSWASTVNTPEFEFDYHPKIAKNKLADKNLLLNLWVINEEQVYNPAPFKMAEMIK 420 

Mil : I I I I 1111:1 : I I I : I : I | | | | : I I I I I t I I I I I : I I I ! I: I I 
Db 359 NIIPSISWASKVNTPDFAYDYAPEKARAFLQDKQLQLTMWVINEEQVYNPSPLKMAELIK 418 

Qy 421 WDLAQAGVKVKVRAVTRPFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILSCGTKNE 480 

1 I I II I I I : III : I : | : I I : I I : I I : I I I II I I I I I I I I I I I I I I I 
Db 419 RDLANVGVKVIVQPVTRTYLIERLKAHSEDYDMILAGWLAGNLDPDSFMRPILSCNTVTE 478 

Qy 481 LTNLSNWCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRILVANSR 540 

= 11 Mil: II III I: |::| II || Mil: |:|:| :|| : 

Db 479 ITNFSNWCDPLFDHFMDNALNTTNLHLRASEYNLAQELILSEVPLIPIANAKRMLVVSPN 538 

Qy 541 VKGVKMTPFGSLDFSTL 557 

I : I I I I : I I I I : : I I 
Db 539 VQGVKMSPFGSINFENL 555 



RESULT 5 
Q65U97_MANSM 

ID Q65U97_MANSM PRELIMINARY; PRT; 567 AA. 
AC Q65U97; 

DT 25-OCT-2004, integrated into UniProtKB/TrEMBL. 
DT 25-OCT-2004, sequence version 1. 
DT 07-FEB-2006, entry version 10. 
DE OppA protein. 
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GN Name=oppA; OrderedLocusNames=MS0856; 

OS Mannheimia succiniciproducens (strain MBEL55E) . 

OC Bacteria; Proteobacteria; Gammaproteobacteria; Pasteurellales; 

OC Pasteurellaceae; Mannheimia. 

OX NCBI_TaxID=221988; 

RN [1] 

RP NUCLEOTIDE SEQUENCE [ LARGE SCALE GENOMIC DNA] . 

RX PubMed=15378067; DOI=10.1038/nbt1010; 

RA Hong S.H., Kim J.S., Lee S.Y., In Y.H., Choi S.S., Rih J.-K., 



RA Kim C.H., Jeong H., Hur C.G., Kim J.J.; 
RT "The genome sequence of the capnophilic rumen bacterium Mannheimia 
RT succiniciproducens."; 
RL Nat. Biotechnol. 22:1275-1281(2004). 

CC 



CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 
CC Distributed under the Creative Commons Attribution-NoDerivs License 



DR EMBL; AE016827; AAU37463.1; - ; Genomic_DNA. 

DR GO; GO: 0005215; F: transporter activity; IEA. 

DR GO; GO:0006810; P:transport; IEA. 

DR InterPro; IPR000914; SBP_bac_5. 

DR Pfam; PF00496; SBP_bac_5; 1. 

KW Complete proteome. 

SQ SEQUENCE 567 AA; 64687 MW; 6DA1B590A970B46D CRC64; 

Query Match 64.4%; Score 1892.5; DB 2; Length 567; 

Best Local Similarity 64.9%; Pred. No. 1.1e-112; 



Matches 362; Conservative 84; Mismatches 103; Indels 9; Gaps 4; 

Qy 8 FLSFLLCISQSVELQ-AAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMNVVTEQIY 66 

1 : 1 I 1 : III II I I : I I I I I I I : I I I I I I I I I I I I I I I I I I : I I I I I 

Db 8 FIGFLLFSAMLPFFSWAAPRVPEILTQNGLIYCTHSSGFSFNPQTADAGTSMNVITEQIY 67 

Qy 67 NKLFDIKNHSATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNAEDVVFS 126 

Db 68 NKLFEIKNNSSRLEPSLAQSYKISEDGKTITVYLRKGVEFHHTPWFTPSRNFNADDVVYS 127 

Qy 127 INRVLGHNTYLPTLAEANVTYSNP QYRVFHEQARKVRFPYFDSIKLNEKIKSVTAL 182 

Db 128 LNRVLGHNTSLP EFNASEQQKGMKRQYNIFHELAKKTRFPYFDSIKLNQKIESVTAL 184 

Qy 183 SPYQVKIELFAPDSSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQVKDY 242 

Db 185 DPYTVQINLFAPDASILSHLASQYAI IFSHEYALQLNADDNLAQLDLLPVGTGPYQVKNY 244 

Qy 243 VYNQYVRLVRNENYWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGLLKN 302 

I I I I I I = I : I I I I I I I I : I : : I | : | | | I I : I I I I I I I I I I I I I :: I : I I I : I I I : 
Db 245 FRNQYVRLIRHENYWKKEAEIKNI IIDLSPDRTGRLAKFFNNECQIAAFPDVSQLGLLQE 304 

Qy 303 DDKHYYMQSTDGMNLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVANNI 362 

Db 305 NGERFQTTLSDGMNLAFLAFNFKRPLMQDAEIRRGIAQAINRHRIIKDIYYNTASVANKI 364 

Qy 363 IPEVSWA-STVNTPEFEFDYHPKIAKNKLADKNLLLNLWVINEEQVYNPAPFKMAEMIKW 421 

I = I I : I I :: I I : I I I : I I I : : I I 

Db 365 IPSVSWAGSDSNNHSFAYDYDPAQAKKVLQDRQLSLDMWVLKEEQLYNPSPIKMAELIKH 424 

Qy 422 DLAQAGVKVKVRAVTRPFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILSCGTKNEL 481 

Db 425 DLTKAGIEVKVRLISRNFLMEQLRNNSENYDLILGGWLAVSLDPDSFMRPILSCGTTSEI 484 

Qy 482 TNLSNWCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRILVANSRV 541 

Db 485 TNLSNWCSQSFEEILDRALISNSTNERAVNYHLAEQEVLSELPILPIASVKRILISNSNV 544 

Qy 542 KGVKMTPFGSLDFSTLYF 559 

:ll:|:||||: I I I 
Db 545 QGVEMSPFGSISFEKLSF 562 



RESULT 6 
Q3EG24_ACTSC 

ID Q3EG24_ACTSC PRELIMINARY; PRT; 561 AA. 
AC Q3EG24; 

DT 08-NOV-2005, integrated into UniProtKB/TrEMBL. 
DT 08-NOV-2005, sequence version 1. 
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DT 07-FEB-2006, entry version 3. 

DE ABC-type dipeptide/oligopeptide/nickel transport systems, periplasmic 

DE components precursor. 

GN ORFNames=AsucDRAFT_0543; 

OS Actinobacillus succinogenes 130Z. 

OC Bacteria; Proteobacteria; Gammaproteobacteria; Pasteurellales ; 

OC Pasteurellaceae; Actinobacillus. 

OX NCBI_TaxID=339671; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=130Z; 

RG US DOE Joint Genome Institute ( JGI-PGF) ; 

RA Copeland A., Lucas S., Lapidus A., Barry K . , Detter J.C., Glavina T . , 

RA Hammon N . , Israni S., Pitluck S., Richardson P.; 

RT "Sequencing of the draft genome and assembly of Actinobacillus 

RT succinogenes 130Z."; 

RL Submitted (SEP-2005) to the EMBL/GenBank/DDBJ databases. 

RN [2] 

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=130Z; 

RG US DOE Joint Genome Institute (JGI-ORNL) ; 

RA Larimer F. , Land M. ; 

RT "Annotation of the draft genome assembly of Actinobacillus 

RT succinogenes 130Z."; 

RL Submitted (SEP-2005) to the EMBL/GenBank/DDBJ databases. 

CC -!- CAUTION: The sequence shown here is derived from an 

CC EMBL/GenBank/DDBJ whole genome shotgun (WGS) entry which is 

CC preliminary data. 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; AAKC01000026; EAO50273.1; -; Genomic DNA. 

DR GO; GO: 0005215; F : transporter activity; IEA. 

DR GO; GO: 0006810; P: transport; IEA. 

DR InterPro; IPR000914; SBP_bac_5. 

DR Pfam; PF00496; SBP_bac_5; 1. 

DR PROSITE; PS01040; SBP_BACTERIAL 5; 1. 

KW Signal. 

FT SIGNAL 1 21 Potential. 

FT SIGNAL 561 561 Potential. 

SQ SEQUENCE 561 AA; 63813 MW; 73AED17987DB7057 CRC64; 

Query Match 59.7%; Score 1752.5; DB 2; Length 561; 
Best Local Similarity 59.1%; Pred. No. 1e-103; 

Matches 335; Conservative 97; Mismatches 124; Indels 11; Gaps 5; 

Qy 1 MLRLNLRFLSFLLCISQSVELQAAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMNV 60 

Db 1 MLKISVTLLFFLI---TSFSVLSAPRVPAELTDNGLIYCTHATGFSFNPQTADAGTSMNV 57 

Qy 61 VTEQIYNKLFDIKNHSATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNA 120 

Db 58 VTEQIYNKLFEIKANSSQVEPSLARSYKISSDGKTITLYLRRGVKFHHTPWFTPSRNFNA 117 

Qy 121 EDVVFSINRVLGHNTYLPTL AEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIK 177 

= 11111:11111111 II I I : | || : | | : | : | I I I I | : I | | I I : I I 

Db 118 DDVVFSLNRVLGHNTSLPEFELETEQNIV--NRQYSIFHDLAKKTRFPYFESIKLNQKIN 175 

Qy 178 SVTALSPYQVKIELFAPDSSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPY 237 

1 :: II 1 = 1 H H ; l 11111:1 I:|l!ll II 

Db 176 YVESVDPYTVQIHLFEPDASILSHLASQYAVIFSHEYALQLNADDNLEQLDTLPVGTGAY 235 

Qy 238 QVKDYVYNQYVRLVRNENYWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQI 297 

I : I = I : I I I I I : I : I I : I Ml : I : : I I I I I : II: I I I I I I I I I I :: I I I I I : 

Db 236 QLKEYLRGQYVRLMPNQYYWRKPAKIANI VIDLSTDKIGRMAKFFNNECQIAAFPEVSQL 295 

Qy 298 GLLKNDDKHYYMQSTDGMNLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTAT 357 

Db 296 GLLQQSGAQFKTTIAEGMNLSFLAFNFLRPTMRNTELRRAIALAINRERLIKHIYYDTAV 355 

Qy 358 VANNIIPEVSWASTVNTPEFEFDYHPKIAKNKLADKNL-LLNLWVINEEQVYNPAPFKMA 416 

mini mi: in n i= n . i : | :: m 

Db 356 VANNIIPAISWAA--GNEVSHFDYDPKKAREMLADMQIPPLEMWLVQEEQVFNPAPIKMA 413 

Qy 417 EMIKWDLAQAGVKVKVRAVTRPFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILSCG 476 
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Db 414 EMIRTDLNAVGLNVKVRLISRNFLMENLHNKTEDYDLILAGWLASSLDPDSFLRPILSCD 473 

Qy 477 TKNELTNLSNWCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRILV 536 

Db 474 TTDEVSNVSNWCSESFDQLLDSALIRSDPHARAVDYAVAQQQVFSELPILPLANVKRILI 533 

Qy 537 ANSRVKGVKMTPFGSLDFSTLYFIQEK 563 

Db 534 SNTRVDGIEVTPFGNIHFEKLSLKKEK 560 

RESULT 7 
Q7VM01_HAEDU 

ID Q7VM01_HAEDU PRELIMINARY; PRT; 560 AA. 

AC Q7VM01; 

DT 01-OCT-2003, integrated into UniProtKB/TrEMBL. 

DT 01-OCT-2003, sequence version 1. 

DT 07-FEB-2006, entry version 13. 

DE Peptide transport periplasmic protein SapA. 

GN Name=sapA; OrderedLocusNames=HD1230; ORFNames=HD_1230; 

OS Haemophilus ducreyi. 

OC Bacteria; Proteobacteria; Gammaproteobacteria; Pasteurellales; 

OC Pasteurellaceae; Haemophilus. 

OX NCBI_TaxID=730; 

RN [1] 

RP NUCLEOTIDE SEQUENCE [ LARGE SCALE GENOMIC DNA] . 

RC STRAIN=35000HP / ATCC 700724; 

RA Munson R.S. Jr., Ray W.C., Mahairas G., Sabo P., Mungur R. , 

RA Johnson L., Nguyen D., Wang J., Forst C., Hood L.; 

RT "The complete genome sequence of Haemophilus ducreyi."; 

RL Submitted (JUN-2003) to the EMBL/GenBank/DDBJ databases 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attr ibution-NoDer ivs License 

CC 

DR EMBL; AE017143; AAP96068.1; -; Genomic DNA. 

DR HSSP; P23847; 1DPE. 

DR GO; GO: 0005215; F: transporter activity; IEA. 

DR GO; GO: 0006810; P: transport; IEA. 

DR InterPro; IPR000914; SBP_bac 5. 

DR Pfam; PF00496; SBP_bac_5; l.~ 

KW Complete proteome. 

SQ SEQUENCE 560 AA; 63809 MW; 9EE7FD98914355C8 CRC64; 

Query Match 43.2%; Score 1267.5; DB 2; Length 560; 

Best Local Similarity 44.8%; Pred. No. 1.2e-72; 

Matches 251; Conservative 110; Mismatches 188; Indels 11; Gaps 4; 

Qy 6 LRFLSFLLCISQSVELQAAPSVPTFLTENGLTYCTHASGFSFNPQTADAGTSMNVVTEQI 65 

1:1 1 : I ! I I I : : I I I I I I I I I I I I I I I : I I I i ! I I I 

Db 9 LKFSPFFAVFCWISTAYSAPRVPKELSADSLIYCTSISGLSFNPQKADVGTNMNVVTEQI 68 

Qy 66 YNKLFDIKNHSATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNAEDVVF 125 

Db 69 YDKLFEIDRHTHRVIPSLAETFSVSDDGKEITLNLRRQVAFHKTPWFTPTRLFNAEDVVF 128 

Qy 126 SINRVLGHNTYLPTL AEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSVTAL 182 

I : I I : : I : III : : : | : | : | : I I I : I : I : I I : : | 

Db 129 SLNRMIGNVEELPALDFNEDSKEQFQQNQRYAYHFKANLAHYPYFESVALKKKIAKISAP 188 

Qy 183 SPYQVKIELFAPDSSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQVKDY 242 

Db 189 NEYTVKIHLVAPDNSVLAHLASQYAVILSKEYALLLNADENLAQLDLLPVGTGVYQLSDY 248 

QY 243 VYNQYVRLVRNENYWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGLLKN 302 

: I : I I I I I I I : = I I I : : : I I I : : : I I : I : I I I I : I I | | : : : 
Db 249 IQNEYVRLKPNPVYWGEKAKINNVVVDFSSNSTGRMAKYLNQECDIVAQPEPSQRRVISS 308 

QY 303 DDKHYYMQSTDGMNLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVANNI 362 

Db 309 YEIVESPGANLAFLAFNMQKEKMQDIAFRRQIAQAINRERLVKALFYGSAEVADNV 364 

QY 36 3 I PEVSWASTVNTPEFEFDYHPKIAKNKLADKNLLLNLWVINEEQVYNPAPFKMAEMIKWD 422 

: ' : 1 I : : HI I = I I I :: I : I I I I I I I I I I : I 
Db 365 LPSALFAQK-NPAAYPYKAPQPRAKNAKLDR LIFWVLDESRVYNLHPLKMAEMIRND 420 
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Qy 423 kAQAGVKVKVRAVTRPFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILSCGTKNELT 4 82 



1 : = I : I I : I = : I I I I I : I I I I I I I I : I : I I I I I | : I : : | 

Db 421 LKK ™IDVIIRPVSRAKVVQLAAAGKADYDLILTGWLANNLDPNAFLSPILSCRTQNKVT 480 

Qy 483 NLSNWCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRILVANSRVK 542 

Db 4 81 NLANWCHQQFDEWLEIAKANQVPYVRNMI YKQTQALLEEQLPILPLLHAQRSLFVNQKIK 540 

Qy 543 GVKMTPFGSLDFSTLYFIQE 562 

Db 541 NAHIEPFGQVRLSELTLHQE 560 

RESULT 8 
Q32FZ3_SHIDS 

ID Q32FZ3_SHIDS PRELIMINARY; PRT; 547 AA. 

AC Q32FZ3; 

DT 06-DEC-2005, integrated into UniProtKB/TrEMBL. 

DT 06-DEC-2005, sequence version 1. 

DT 07-FEB-2006, entry version 4. 

DE SapA. 

GN Name=sapA; OrderedLocusNames=SDY_1638;

OS Shigella dysenteriae serotype 1 (strain Sd197).

OC Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales;

OC Enterobacteriaceae; Shigella.

OX NCBI_TaxID=300267;
RN [1] 

RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].

RX PubMed=16275786; DOI=10.1093/nar/gki954;

RA Yang F., Yang J., Zhang X., Chen L., Jiang Y., Yan Y., Tang X.,

RA Wang J., Xiong Z., Dong J., Xue Y., Zhu Y., Xu X., Sun L., Chen S.,

RA Nie H., Peng J., Xu J., Wang Y., Yuan Z., Wen Y., Yao Z., Shen Y.,

RA Qiang B., Hou Y., Yu J., Jin Q.;

RT "Genome dynamics and diversity of Shigella species, the etiologic 

RT agents of bacillary dysentery."; 

RL Nucleic Acids Res. 33:6445-6458(2005). 

CC

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; CP000034; ABB61762.1; -; Genomic_DNA. 

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006810; P:transport; IEA. 

KW Complete proteome.

SQ SEQUENCE 547 AA; 61541 MW; 01AFB93E4B790FF3 CRC64; 

Query Match 35.1%; Score 1030.5; DB 2; Length 547; 

Best Local Similarity 37.7%; Pred. No. 1.8e-57; 

Matches 210; Conservative 117; Mismatches 199; Indels 31; Gaps 9; 
Qy 9 LSFLLCISQSVELQ--AAPSVPTF--LTENGLTYCTHASGFSFNPQTADAGTSMNVVTEQ 64

Db 5 LSSLLVIAGLVSGQAIAAFKSPPHADIRDSGFVYCVSGQVNTFNPSKASSGLIVDTLAAQ 64 

Qy 65 IYNKLFDIKNHSATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNAEDVV 124

Db 65 FYDRLLDVDPYTYRLIPELAESWEVLDNGATYRFHLRRDVPFQKTDWFTPTRKMNADDVV 124 

Qy 125 FSINRVLGHNTYLPTLAEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSVTALSP 184 



Db 125 FTFQRIFDHN NPWHNV NGSNFPYFDSLQFADNVKSIRKLDN 165 

Qy 185 YQVKIELFAPDSSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQVKDYVY 244

= ' : I I I = I I I I I : I I : I I I I : I : I I I I I I I I I I I I: :| 
Db 166 HTVEFRLAQPDASFLWHLATHYASVMSAEYARKLEKEDRQEQLDRQPVGTGPYQLSEYRA 225 

Qy 245 NQYVRLVRNENYWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGLLKNDD 304

l-M : : ::|ll : : I : : : I II: :|: || 

Db 226 GQFIRLQRHDDFWRGKPLMPQVVVDLGSGGTGRLSKLLTGECDVLAWPAASQLSILR-DD 284

Qy 305 KHYYMQSTDGMNLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVANNIIP 364

= 111:111111 II : : :| I:: ::| I:: III: || | :|:| 
Db 285 PRLRLTLRPGMNVAYLAFNTAKPPLNNPAVRHALALAINNQRLMQSIYYGTAETAASILP 344

Qy 365 EVSWASTVNTPEFEFDYHPKIAKNKLAD---KNLLLNLWVINEEQVYNPAPFKMAEMIKW 421

Db 345 RASWAYD-NEAKIT-EYNPAKSREQLKSLGLENLTLKLWVPTRSQAWNPSPLKTAELIQA 402

Qy 422 DLAQAGVKVKVRAVTRPFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILSCGTKNEL 481 

I : I I I I I I : I I I : I : I : I I I I I I : I I I I I I : I I I : 

Db 403 DMAQVGVKVVIVPVEGRFQEARLMDMS--HDLTLSGWATDSNDPDSFFRPLLSCAAIHSQ 460 

Qy 482 TNLSNWCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRILVANSRV 541

Db 461 TNLAHWCDPKFDSVLRKALSSQQLAARIEAYDEAQSILAQELPILPLASSLRLQAYRYDI 520

Qy 542 KGVKMTPFGSLDFSTLY 558 

I I : : : I I I : I : : I 

Db 521 KGLVLSPFGNASFAGVY 537
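
The percentages in the summary lines of each result can be cross-checked against the counts printed beside them. The sketch below is an inference from the figures in this report, not the search program's documented method. It assumes that "Query Match" is the hit score divided by the query's perfect (self-match) score, taken here to be 2937 from the report header, and that "Best Local Similarity" is the number of matches divided by the sum of matches, conservative substitutions, mismatches and indels. The function names are hypothetical.

```python
def query_match_pct(score, perfect_score):
    """Hit score as a percentage of the query's perfect score (assumed definition)."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches, conservative, mismatches, indels):
    """Identical positions as a percentage of all aligned columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Result 8 (Q32FZ3_SHIDS): Score 1030.5, counts 210 / 117 / 199 / 31
print(query_match_pct(1030.5, 2937))
print(best_local_similarity_pct(210, 117, 199, 31))
```

Under these assumptions the computed values agree with the 35.1% and 37.7% reported for Result 8, and the same check can be repeated with the counts of Results 9 and 10.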

RESULT 9 
Q83RL7_SHIFL 

ID Q83RL7_SHIFL PRELIMINARY; PRT; 547 AA. 

AC Q83RL7; 

DT 01-JUN-2003, integrated into UniProtKB/TrEMBL.

DT 01-JUN-2003, sequence version 1. 

DT 07-FEB-2006, entry version 15. 

DE Peptide transport periplasmic protein. 

GN Name=sapA; OrderedLocusNames=SF1299; 

OS Shigella flexneri. 

OC Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; 

OC Enterobacteriaceae; Shigella. 

OX NCBI_TaxID=623; 

RN [1] 

RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].

RC STRAIN=301 / Serotype 2a;

RX MEDLINE=22272406; PubMed=12384590; DOI=10.1093/nar/gkf566;

RA Jin Q., Yuan Z., Xu J., Wang Y., Shen Y., Lu W., Wang J., Liu H.,

RA Yang J., Yang F., Zhang X., Zhang J., Yang G., Wu H., Qu D., Dong J.,

RA Sun L., Xue Y., Zhao A., Gao Y., Zhu J., Kan B., Ding K., Chen S.,

RA Cheng H., Yao Z., He B., Chen R., Ma D., Qiang B., Wen Y., Hou Y.,

RA Yu J.; 

RT "Genome sequence of Shigella flexneri 2a: insights into pathogenicity 

RT through comparison with genomes of Escherichia coli K12 and O157.";

RL Nucleic Acids Res. 30:4432-4441(2002). 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; AE005674; AAN42910.1; -; Genomic_DNA. 

DR HSSP; P23847; 1DPE. 

DR BioCyc; SFLE198214:AAN42910.1-MONOMER; -.

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006810; P:transport; IEA. 

DR InterPro; IPR000914; SBP_bac_5. 

DR Pfam; PF00496; SBP_bac_5; 1. 

DR PROSITE; PS01040; SBP_BACTERIAL_5; 1.

KW Complete proteome. 

SQ SEQUENCE 547 AA; 61548 MW; 8955A6DE58731D48 CRC64; 

Query Match 35.0%; Score 1027.5; DB 2; Length 547; 
Best Local Similarity 37.8%; Pred. No. 2.7e-57; 

Matches 213; Conservative 112; Mismatches 194; Indels 45; Gaps 9; 

Qy 9 LSFLLCISQSVELQ--AAPSVPTF--LTENGLTYCTHASGFSFNPQTADAGTSMNVVTEQ 64

Db 5 LSSLLVIAGLVSGQAIAAPESPPHADIRDSGFVYCVSGQVNTFNPSKASSGLIVDTLAAQ 64 

Qy 65 IYNKLFDIKNHSATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNAEDVV 124 

I : : I I : :: I I I I : I : : : I : I I I I : I I I I I I I I I : I I I 

Db 65 FYDRLLDVDPYTYRLMPELAESWEVLDNGATYRFHLRRDVPFQKTDWFTPTRKMNADDVV 124 

Qy 125 FSINRVLGHNTYLPTLAEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSVTALSP 184 

I : I : I I I : I I I I I I I : : : : I I I I 

Db 125 FTFQRIFDRN NPWHNV NGSNFPYFDSLQFADNVKSVRKLDN 165 

Qy 185 YQVKIELFAPDSSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQVKDYVY 244 

Db 166 HTVEFRLAQPDASFLWHLATHYASVMSAEYARKLEKEDRQEQLDRQPVGTGPYQLSEYRA 225 
Qy 245 NQYVRLVRNENYWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGLLKNDD 304

Db 226 GQFIRLQRHDDFWRGKPLMPQVVVDLGSGGTGRLSKLLTGECDVLAWPAASQLSILR-DD 284

Qy 305 KHYYMQSTDGMNLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVANNIIP 364

: 111:111111 II:: : I I : : : : I I : : I I I : I I I : I : I 

Db 285 PRLRLTLRPGMNVAYLAFNTAKPPLNNPAVRHALALAINNQRLMQSIYYGTAETAASILP 344

Qy 365 EVSWASTVNTPEFEFDYHPKI AKNK LADKNLLLNLWVINEEQVYNPAPFK 414 

III = I II II:: I : I I I I I I I : I I : I I 

Db 345 RASWA YDNEAKITEYNPAKSREQLKALGLENLTLKLWVPTRSQAWNPSPLK 395 

Qy 415 MAEMIKWDLAQAGVKVKVRAVTRPFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILS 474 

Db 396 TAELIQADMAQVGVKVVIVPVEGRFQEARLMDMS--HDLTLSGWATDSNDPDSFFRPLLS 453

Qy 475 CGTKNELTNLSNWCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRI 534 

Db 454 CAAIHSQTNLAHWCNPKFDSVLRKALSSQQLAARIEAYDEAQSILAQELPILPLASSLRL 513 

Qy 535 LVANSRVKGVKMTPFGSLDFSTLY 558 

: I I : : : I I I : I : : I 
Db 514 QAYRYDIKGLVLSPFGNASFAGVY 537 

RESULT 10 
Q8CW41_ECOL6 

ID Q8CW41_ECOL6 PRELIMINARY; PRT; 547 AA.

AC Q8CW41;

DT 01-MAR-2003, integrated into UniProtKB/TrEMBL.

DT 01-MAR-2003, sequence version 1. 

DT 21-FEB-2006, entry version 15. 

DE Peptide transport periplasmic protein sapA.

GN Name=sapA; ORFNames=c_1771;

OS Escherichia coli O6.

OC Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; 

OC Enterobacteriaceae; Escherichia. 

OX NCBI_TaxID=217992; 

RN [1] 

RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA].

RC STRAIN=O6:H1 / CFT073 / ATCC 700928 / UPEC;

RX MEDLINE=22388234; PubMed=12471157; DOI=10.1073/pnas.252529799;

RA Welch R.A., Burland V., Plunkett G. III, Redford P., Roesch P.,

RA Rasko D., Buckles E.L., Liou S.-R., Boutin A., Hackett J., Stroud D.,

RA Mayhew G.F., Rose D.J., Zhou S., Schwartz D.C., Perna N.T., 

RA Mobley H.L.T., Donnenberg M.S., Blattner F.R.; 

RT "Extensive mosaic structure revealed by the complete genome sequence 

RT of uropathogenic Escherichia coli."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:17020-17024(2002).

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License

CC 

DR EMBL; AE014075; AAN80237.1; -; Genomic_DNA.

DR HSSP; P23847; 1DPE.

DR BioCyc; ECOL199310:C1771-MONOMER; -.

DR GO; GO:0005215; F:transporter activity; IEA.

DR GO; GO:0006810; P:transport; IEA.

DR InterPro; IPR000914; SBP_bac_5. 

DR Pfam; PF00496; SBP_bac_5; 1. 

DR PROSITE; PS01040; SBP_BACTERIAL_5; 1.

KW Complete proteome. 

SQ SEQUENCE 547 AA; 61466 MW; 48A9789B9C9C7947 CRC64; 

Query Match 34.9%; Score 1025.5; DB 2; Length 547; 

Best Local Similarity 37.8%; Pred. No. 3.7e-57; 

Matches 213; Conservative 112; Mismatches 194; Indels 45; Gaps 9; 

Qy 9 LSFLLCISQSVELQ--AAPSVPTF--LTENGLTYCTHASGFSFNPQTADAGTSMNVVTEQ 64

Mill: I I III I : : : I II : I I I I : I : : : I 

Db 5 LSSLLVIAGLVSGQAIAAPESPPHADIRDSGFVYCVSGQVNTFNPSKASSGLIVDTLAAQ 64 

Qy 65 IYNKLFDIKNHSATLTPMLAQSYSISADGKEILLNLRHGVKFHQTPWFTPTRDFNAEDVV 124 

Db 65 FYDRLLDVDPYTYRLMPELAESWEVLDNGATYRFHLRRDVPFQKTDWFTPTRKMNADDVV 124 
Qy 125 FSINRVLGHNTYLPTLAEANVTYSNPQYRVFHEQARKVRFPYFDSIKLNEKIKSVTALSP 184

I : I : I I I : | I I I I I I : : : : I I I I 

Db 125 FTFQRIFDRN NPWHNV NGSNFPYFDSLQFADNVKSVRKLDN 165 

Qy 185 YQVKIELFAPDSSILSHLASQYAIIFSQEYAYQLSADDNLAQLDTHPVGTGPYQVKDYVY 244

Db 166 HTVEFRLAQPDASFLWHLATHYASVMSAEYAGKLEKEDRQEQLDRQPVGTGPYQLSEYRA 225 

Qy 245 NQYVRLVRNENYWKKEAKIEHIIVDLSTDRSGRLVKFFNNECQIASYPEVSQIGLLKNDD 304 

Db 226 GQYIRLQRHDDFWRGKPLMPQVVVDLGSGGTGRLSKLLTGECDVLAWPAASQLSILR-DD 284

Qy 305 KHYYMQSTDGMNLAYLAFNFDKPLMRDHEIRAAISQSLNRARIIHSIYHNTATVANNIIP 364

Db 285 PRLRLTLRPGMNVAYLAFNTAKPPLNNPAVRHALALAINNQRLMQSIYYGTAETAASILP 344 

Qy 365 EVSWASTVNTPEFEFDYHPKI AKNK LADKNLLLNLWVINEEQVYNPAPFK 414

Db 345 RASWA YDNEAKITEYN PAKSREQLKALGLENLTLKLWVPTRSQAWNPSPLK 395 

Qy 415 MAEMIKWDLAQAGVKVKVRAVTRPFLTAQLRNQSENYDLILSGWLAGNLDPDGFMRPILS 474

Db 396 TAELIQADMAQVGVKVVIVPVEGRFQEARLMDMS--HDLTLSGWATDSNDPDSFFRPLLS 453 

Qy 475 CGTKNELTNLSNWCNEEFDQFMDRAITTSHLSSRAKAYNEAQELVLRELPIIPIANVKRI 534

Db 454 CAAIHSQTNLAHWCDPKFDSVLRKALSSQQLAARIEAYDEAQSILAQELPILPLASSLRL 513

Qy 535 LVANSRVKGVKMTPFGSLDFSTLY 558

Db 514 QAYRYDIKGLVLSPFGNASFAGVY 537

RESULT 11 
SAPA_ECOLI 

ID SAPA_ECOLI STANDARD; PRT; 547 AA. 

AC Q47622; P77358; 

DT 01-NOV-1997, integrated into UniProtKB/Swiss-Prot.

DT 01-NOV-1996, sequence version 1. 

DT 07-MAR-2006, entry version 38. 

DE Peptide transport periplasmic protein sapA precursor. 

GN Name=sapA; OrderedLocusNames=b1294;

OS Escherichia coli.

OC Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales;

OC Enterobacteriaceae; Escherichia. 

OX NCBI_TaxID=562; 

RN [1] 

RP NUCLEOTIDE SEQUENCE [GENOMIC DNA].

RC STRAIN=K12 / FRAG 5;

RA Epstein W., Noelker E., Stumpe S., Tewes R., Schmid R., Bakker E.P.;

RL Submitted (APR-1996) to the EMBL/GenBank/DDBJ databases.

RN [2] 

RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA] . 

RC STRAIN=K12 / MG1655; 

RX MEDLINE=97426617; PubMed=9278503; DOI=10.1126/science.277.5331.1453;

RA Blattner F.R., Plunkett G. III, Bloch C.A., Perna N.T., Burland V.,

RA Riley M., Collado-Vides J., Glasner J.D., Rode C.K., Mayhew G.F.,

RA Gregor J., Davis N.W., Kirkpatrick H.A., Goeden M.A., Rose D.J.,

RA Mau B., Shao Y.;

RT "The complete genome sequence of Escherichia coli K-12.";

RL Science 277:1453-1474(1997).

RN [3] 

RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA] . 

RC STRAIN=K12; 

RX MEDLINE=97251357; PubMed=9097039; DOI=10.1093/dnares/3.6.363;

RA Aiba H., Baba T., Fujita K., Hayashi K., Inada T., Isono K., Itoh T.,

RA Kasai H., Kashimoto K., Kimura S., Kitakawa M., Kitagawa M.,

RA Makino K., Miki T., Mizobuchi K., Mori H., Mori T., Motomura K., 

RA Nakade S., Nakamura Y., Nashimoto H., Nishio Y., Oshima T., Saito N., 

RA Sampei G., Seki Y., Sivasundaram S., Tagami H., Takeda J., 

RA Takemoto K., Takeuchi Y., Wada C., Yamamoto Y., Horiuchi T.;

RT "A 570-kb DNA sequence of the Escherichia coli K-12 genome 

RT corresponding to the 28.0-40.1 min region on the linkage map."; 

RL DNA Res. 3:363-377(1996). 

CC -!- FUNCTION: Involved in a peptide intake transport system that plays 

CC a role in the resistance to antimicrobial peptides.






CC -!- INTERACTION:

CC P0A6Y8:dnaK; NbExp=1; IntAct=EBI-549564, EBI-542092;

CC -!- SUBCELLULAR LOCATION: Periplasmic (Probable).

CC -!- SIMILARITY: Belongs to the bacterial solute-binding protein 5

CC family.

CC

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms

CC Distributed under the Creative Commons Attribution-NoDerivs License



DR EMBL; X97282; CAA65937.1; Genomic_DNA.

DR EMBL; U00096; AAC74376.1; Genomic_DNA.

DR EMBL; D90768; BAA14864.1; Genomic_DNA.

DR EMBL; D90767; BAA14855.1; Genomic_DNA.

DR PIR; A64878; A64878.

DR HSSP; P23847; 1DPE.

DR GenomeReviews; U00096_GR; b1294.

DR EchoBASE; EB4155; -.

DR EcoGene; EG20254; sapA.

DR BioCyc; EcoCyc:SAPA-MONOMER; -.

DR GO; GO:0005515; F:protein binding; IPI.

DR InterPro; IPR000914; SBP_bac_5.

DR Pfam; PF00496; SBP_bac_5; 1.

DR PROSITE; PS01040; SBP_BACTERIAL_5; 1.

KW Complete proteome; Peptide transport; Periplasmic; Protein transport;

KW Signal; Transport.

FT SIGNAL 1 21 Potential.

FT CHAIN 22 547 Peptide transport periplasmic protein

FT /FTId=PRO_0000031801.

FT 35 SG -> RV (in Ref. 3).

SQ SEQUENCE 547 AA; 61565 MW; EB552BB3B8E102BF CRC64;